<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>4/nw</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Language" content="en-gb">
<link href="inweb.css" rel="stylesheet" rev="stylesheet" type="text/css">
</head>
<body>

<!--Weave of '4/nw' generated by 7-->
<ul class="crumbs"><li><a href="../webs.html">★</a></li><li><a href="index.html">words</a></li><li><a href="index.html#4">Chapter 4: Parsing</a></li><li><b>Numbered Words</b></li></ul><p class="purpose">Some utilities for handling single words referred to by number.</p>
<ul class="toc"><li><a href="#SP1">§1. Comparisons</a></li><li><a href="#SP3">§3. Correct use of text substitutions</a></li><li><a href="#SP5">§5. Casing and sentence division</a></li><li><a href="#SP8">§8. Dequoting literal text</a></li><li><a href="#SP9">§9. Dictionary words</a></li></ul><hr class="tocbar">
<p class="inwebparagraph"><a id="SP1"></a><b>§1. Comparisons. </b>Comparing the word at a given numbered position against some known
word, say "rulebook", must be done very quickly. The whole point of having the
vocabulary bank identify each distinct word was to reduce this to a single
comparison of pointers, and to avoid the overhead of a function call we
perform it with macros. (The exception is <code class="display"><span class="extract">compare_words_cs</span></code>, which
compares the raw text of two words as strings, and is therefore case
sensitive and slower.)
</p>

<pre class="definitions">
<span class="definitionkeyword">define</span> <span class="identifier">compare_word</span><span class="plain">(</span><span class="identifier">w</span><span class="plain">, </span><span class="identifier">voc</span><span class="plain">) (</span><span class="functiontext">Lexer::word</span><span class="plain">(</span><span class="identifier">w</span><span class="plain">) == (</span><span class="identifier">voc</span><span class="plain">))</span>
<span class="definitionkeyword">define</span> <span class="identifier">compare_words</span><span class="plain">(</span><span class="identifier">w1</span><span class="plain">, </span><span class="identifier">w2</span><span class="plain">) (</span><span class="functiontext">Lexer::word</span><span class="plain">(</span><span class="identifier">w1</span><span class="plain">) == </span><span class="functiontext">Lexer::word</span><span class="plain">(</span><span class="identifier">w2</span><span class="plain">))</span>
<span class="definitionkeyword">define</span> <span class="identifier">compare_words_cs</span><span class="plain">(</span><span class="identifier">w1</span><span class="plain">, </span><span class="identifier">w2</span><span class="plain">) (</span><span class="identifier">Wide::cmp</span><span class="plain">(</span><span class="functiontext">Lexer::word_raw_text</span><span class="plain">(</span><span class="identifier">w1</span><span class="plain">), </span><span class="functiontext">Lexer::word_raw_text</span><span class="plain">(</span><span class="identifier">w2</span><span class="plain">)) == 0)</span>
</pre>
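<p class="inwebparagraph">To see why a pointer comparison is enough, here is a minimal sketch of word
interning in plain C. It is an illustration only, not Inform's vocabulary code:
the names <code class="display"><span class="extract">vocab_entry</span></code>, <code class="display"><span class="extract">bank</span></code>, <code class="display"><span class="extract">MAX_BANK</span></code>, <code class="display"><span class="extract">intern</span></code> and <code class="display"><span class="extract">same_word</span></code>
are all hypothetical, and a real implementation would presumably use something
faster than a linear search. Each distinct spelling is stored exactly once, so
two occurrences of the same word share one entry.
</p>

```c
#include <stdlib.h>
#include <wchar.h>

#define MAX_BANK 256   /* hypothetical capacity, for illustration only */

/* Hypothetical stand-in for a vocabulary entry: one per distinct word. */
typedef struct vocab_entry {
    wchar_t *text;
} vocab_entry;

static vocab_entry bank[MAX_BANK];
static int bank_size = 0;

/* Return the unique entry for this spelling, creating it if necessary.
   Returns NULL if the bank is full or memory runs out. */
vocab_entry *intern(const wchar_t *text) {
    for (int i = 0; i < bank_size; i++)
        if (wcscmp(bank[i].text, text) == 0) return &bank[i];
    if (bank_size == MAX_BANK) return NULL;
    size_t n = wcslen(text) + 1;
    wchar_t *copy = malloc(n * sizeof(wchar_t));
    if (copy == NULL) return NULL;
    wmemcpy(copy, text, n);
    bank[bank_size].text = copy;
    return &bank[bank_size++];
}

/* Analogue of compare_word: one pointer comparison, no string scan. */
#define same_word(a, b) ((a) == (b))
```

<p class="inwebparagraph">The string comparison is paid once, when a word is interned. After that,
every test of one word against another is a single <code class="display"><span class="extract">==</span></code>.
</p>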
<p class="inwebparagraph"><a id="SP2"></a><b>§2. </b>We can also, more slowly, perform a direct string comparison. Carried
out on the original, raw, text, this is case sensitive, which is usually
wrong for Inform purposes. On the treated text, however, we are comparing a
case-normalised version of the original word, so the comparison is in effect
case insensitive, provided that the content of <code class="display"><span class="extract">t</span></code> is also normalised.
</p>

<pre class="display">
<span class="reserved">int</span><span class="plain"> </span><span class="functiontext">Word::compare_by_strcmp</span><span class="plain">(</span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">w</span><span class="plain">, </span><span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">t</span><span class="plain">) {</span>
<span class="reserved">return</span><span class="plain"> (</span><span class="identifier">Wide::cmp</span><span class="plain">(</span><span class="functiontext">Lexer::word_text</span><span class="plain">(</span><span class="identifier">w</span><span class="plain">), </span><span class="identifier">t</span><span class="plain">) == 0);</span>
<span class="plain">}</span>
<span class="reserved">int</span><span class="plain"> </span><span class="functiontext">Word::compare_raw_by_strcmp</span><span class="plain">(</span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">w</span><span class="plain">, </span><span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">t</span><span class="plain">) {</span>
<span class="reserved">return</span><span class="plain"> (</span><span class="identifier">Wide::cmp</span><span class="plain">(</span><span class="functiontext">Lexer::word_raw_text</span><span class="plain">(</span><span class="identifier">w</span><span class="plain">), </span><span class="identifier">t</span><span class="plain">) == 0);</span>
<span class="plain">}</span>
</pre>
<p class="inwebparagraph"></p>
<p class="endnote">The function Word::compare_by_strcmp is used in 4/bn (<a href="4-bn.html#SP7">§7</a>).</p>
<p class="endnote">The function Word::compare_raw_by_strcmp appears nowhere else.</p>
<p class="inwebparagraph"><a id="SP3"></a><b>§3. Correct use of text substitutions. </b>If a "word" is going to be quoted literal text, then it has to use the
characters <code class="display"><span class="extract">[</span></code> and <code class="display"><span class="extract">]</span></code> in a matched way, and without nesting them. The
first routine below verifies that, returning <code class="display"><span class="extract">NOT_APPLICABLE</span></code> if the text contains
no brackets at all; the second is a cruder test, which reports only whether
either bracket character is present.
</p>

<p class="inwebparagraph">These rules are quite strict. It could be argued that nested brackets should be
allowed, so that text substitutions could contain comments, but the result would
be hard to read and tricky for the user interface applications to syntax-colour.
</p>

<pre class="display">
<span class="reserved">int</span><span class="plain"> </span><span class="functiontext">Word::well_formed_text_routine</span><span class="plain">(</span><span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">fw</span><span class="plain">) {</span>
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">i</span><span class="plain">, </span><span class="identifier">escaped</span><span class="plain"> = </span><span class="identifier">NOT_APPLICABLE</span><span class="plain">;</span>
<span class="reserved">for</span><span class="plain"> (</span><span class="identifier">i</span><span class="plain">=0; </span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">] != 0; </span><span class="identifier">i</span><span class="plain">++) {</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">] == </span><span class="constant">TEXT_SUBSTITUTION_BEGIN</span><span class="plain">) {</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">escaped</span><span class="plain"> == </span><span class="identifier">TRUE</span><span class="plain">) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">FALSE</span><span class="plain">;</span>
<span class="identifier">escaped</span><span class="plain"> = </span><span class="identifier">TRUE</span><span class="plain">;</span>
<span class="plain">}</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">] == </span><span class="constant">TEXT_SUBSTITUTION_END</span><span class="plain">) {</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">escaped</span><span class="plain"> != </span><span class="identifier">TRUE</span><span class="plain">) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">FALSE</span><span class="plain">;</span>
<span class="identifier">escaped</span><span class="plain"> = </span><span class="identifier">FALSE</span><span class="plain">;</span>
<span class="plain">}</span>
<span class="plain">}</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">escaped</span><span class="plain"> == </span><span class="identifier">NOT_APPLICABLE</span><span class="plain">) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">escaped</span><span class="plain">;</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">escaped</span><span class="plain">) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">FALSE</span><span class="plain">;</span>
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">TRUE</span><span class="plain">;</span>
<span class="plain">}</span>
<span class="reserved">int</span><span class="plain"> </span><span class="functiontext">Word::perhaps_ill_formed_text_routine</span><span class="plain">(</span><span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">fw</span><span class="plain">) {</span>
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">i</span><span class="plain">;</span>
<span class="reserved">for</span><span class="plain"> (</span><span class="identifier">i</span><span class="plain">=0; </span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">] != 0; </span><span class="identifier">i</span><span class="plain">++) {</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">] == </span><span class="constant">TEXT_SUBSTITUTION_BEGIN</span><span class="plain">) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">TRUE</span><span class="plain">;</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">] == </span><span class="constant">TEXT_SUBSTITUTION_END</span><span class="plain">) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">TRUE</span><span class="plain">;</span>
<span class="plain">}</span>
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">FALSE</span><span class="plain">;</span>
<span class="plain">}</span>
</pre>
<p class="inwebparagraph"></p>
<p class="endnote">The function Word::well_formed_text_routine appears nowhere else.</p>
<p class="endnote">The function Word::perhaps_ill_formed_text_routine is used in 2/vcb (<a href="2-vcb.html#SP16">§16</a>).</p>
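<p class="inwebparagraph">The three possible outcomes of the well-formedness check are easiest to see
on sample inputs. The following is a standalone restatement of the same logic
on ordinary <code class="display"><span class="extract">char</span></code> strings, written so that it can be compiled on its own. It
assumes that the substitution delimiters are <code class="display"><span class="extract">[</span></code> and <code class="display"><span class="extract">]</span></code> and that
<code class="display"><span class="extract">NOT_APPLICABLE</span></code> has the value 2; the function name <code class="display"><span class="extract">well_formed</span></code> is hypothetical.
</p>

```c
#include <stddef.h>

#define TRUE 1
#define FALSE 0
#define NOT_APPLICABLE 2   /* assumed value, meaning "no brackets at all" */

/* Same logic as the routine above, on plain char strings, with '[' and ']'
   assumed to be the substitution delimiters. */
int well_formed(const char *fw) {
    int inside = NOT_APPLICABLE;
    for (size_t i = 0; fw[i] != 0; i++) {
        if (fw[i] == '[') {
            if (inside == TRUE) return FALSE;   /* nested opening bracket */
            inside = TRUE;
        }
        if (fw[i] == ']') {
            if (inside != TRUE) return FALSE;   /* close without an open */
            inside = FALSE;
        }
    }
    if (inside == NOT_APPLICABLE) return NOT_APPLICABLE;
    if (inside) return FALSE;                   /* still open at the end */
    return TRUE;
}
```

<p class="inwebparagraph">Under these assumptions <code class="display"><span class="extract">"say [the noun] now"</span></code> is well formed, <code class="display"><span class="extract">"a [b [c] d]"</span></code>
and <code class="display"><span class="extract">"a [b"</span></code> are not, and <code class="display"><span class="extract">"plain text"</span></code> yields <code class="display"><span class="extract">NOT_APPLICABLE</span></code>. Since that
value is non-zero, a caller testing the result as a simple truth value would
treat bracket-free text as well formed.
</p>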
<p class="inwebparagraph"><a id="SP4"></a><b>§4. </b>The following writes the entire lexer output so far to the log, one
word per line. This is not to be done lightly: the output can be enormous.
</p>

<pre class="display">
<span class="reserved">void</span><span class="plain"> </span><span class="functiontext">Word::log_lexer_output</span><span class="plain">(</span><span class="reserved">void</span><span class="plain">) {</span>
<span class="identifier">LOG</span><span class="plain">(</span><span class="string">"Entire lexer output to date:\</span><span class="plain">n</span><span class="string">"</span><span class="plain">);</span>
<span class="reserved">for</span><span class="plain"> (</span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">i</span><span class="plain">=0; </span><span class="identifier">i</span><span class="plain">&lt;</span><span class="identifier">lexer_wordcount</span><span class="plain">; </span><span class="identifier">i</span><span class="plain">++)</span>
<span class="identifier">LOG</span><span class="plain">(</span><span class="string">"%d: &lt;%+N&gt; &lt;%N&gt; &lt;%02x&gt;\</span><span class="plain">n</span><span class="string">"</span><span class="plain">, </span><span class="identifier">i</span><span class="plain">, </span><span class="identifier">i</span><span class="plain">, </span><span class="identifier">i</span><span class="plain">, </span><span class="functiontext">Lexer::break_before</span><span class="plain">(</span><span class="identifier">i</span><span class="plain">));</span>
<span class="identifier">LOG</span><span class="plain">(</span><span class="string">"------\</span><span class="plain">n</span><span class="string">"</span><span class="plain">);</span>
<span class="plain">}</span>
</pre>
<p class="inwebparagraph"></p>
<p class="endnote">The function Word::log_lexer_output appears nowhere else.</p>
<p class="inwebparagraph"><a id="SP5"></a><b>§5. Casing and sentence division. </b>Casing is only sometimes informative in English. The first word of
a sentence is expected to begin with an upper-case letter, so there we have
no easy way to tell the name of a person or institution from a common noun.
Elsewhere, though, an upper-case initial letter is unexpected, and can
tell us something.
</p>

<pre class="display">
<span class="reserved">int</span><span class="plain"> </span><span class="functiontext">Word::unexpectedly_upper_case</span><span class="plain">(</span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">wn</span><span class="plain">) {</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">wn</span><span class="plain">&lt;1) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">FALSE</span><span class="plain">;</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">compare_word</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">-1, </span><span class="identifier">FULLSTOP_V</span><span class="plain">)) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">FALSE</span><span class="plain">;</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">compare_word</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">-1, </span><span class="identifier">PARBREAK_V</span><span class="plain">)) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">FALSE</span><span class="plain">;</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">compare_word</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">-1, </span><span class="identifier">COLON_V</span><span class="plain">)) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">FALSE</span><span class="plain">;</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">isupper</span><span class="plain">(*(</span><span class="functiontext">Lexer::word_raw_text</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">)))) {</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="functiontext">Word::text_ending_sentence</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">-1)) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">FALSE</span><span class="plain">;</span>
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">TRUE</span><span class="plain">;</span>
<span class="plain">}</span>
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">FALSE</span><span class="plain">;</span>
<span class="plain">}</span>
</pre>
<p class="inwebparagraph"></p>
<p class="endnote">The function Word::unexpectedly_upper_case is used in 2/wa (<a href="2-wa.html#SP11">§11</a>), 4/prf (<a href="4-prf.html#SP53">§53</a>), 4/bn (<a href="4-bn.html#SP2">§2</a>).</p>
<p class="inwebparagraph"><a id="SP6"></a><b>§6. </b>Is the word at <code class="display"><span class="extract">wn</span></code> in single quotes? We count the single quotation
marks at its two ends, so that the result is 0, 1 or 2.
</p>

<pre class="display">
<span class="reserved">int</span><span class="plain"> </span><span class="functiontext">Word::singly_quoted</span><span class="plain">(</span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">wn</span><span class="plain">) {</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">wn</span><span class="plain">&lt;1) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">FALSE</span><span class="plain">;</span>
<span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">p</span><span class="plain"> = </span><span class="functiontext">Lexer::word_raw_text</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">);</span>
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">qc</span><span class="plain"> = 0;</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">p</span><span class="plain">[0] == </span><span class="character">'\</span><span class="plain">'</span><span class="character">'</span><span class="plain">) </span><span class="identifier">qc</span><span class="plain">++;</span>
<span class="reserved">if</span><span class="plain"> ((</span><span class="identifier">Wide::len</span><span class="plain">(</span><span class="identifier">p</span><span class="plain">) > 1) && (</span><span class="identifier">p</span><span class="plain">[</span><span class="identifier">Wide::len</span><span class="plain">(</span><span class="identifier">p</span><span class="plain">)-1] == </span><span class="character">'\</span><span class="plain">'</span><span class="character">'</span><span class="plain">)) </span><span class="identifier">qc</span><span class="plain">++;</span>
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">qc</span><span class="plain">;</span>
<span class="plain">}</span>
</pre>
<p class="inwebparagraph"></p>
<p class="endnote">The function Word::singly_quoted appears nowhere else.</p>
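<p class="inwebparagraph"><a id="SP7"></a><b>§7. </b>Does the word at <code class="display"><span class="extract">wn</span></code> appear to be a piece of quoted text which, because
it ends with punctuation, may also end the sentence which quotes it? The
punctuation in question is a full stop, question mark or exclamation mark,
optionally followed by a close bracket or a single quotation mark, coming
just before the closing double quote.
</p>

<pre class="display">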
<span class="reserved">int</span><span class="plain"> </span><span class="functiontext">Word::text_ending_sentence</span><span class="plain">(</span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">wn</span><span class="plain">) {</span>
<span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">p</span><span class="plain"> = </span><span class="functiontext">Lexer::word_raw_text</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">);</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">p</span><span class="plain">[0] != </span><span class="character">'"'</span><span class="plain">) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">FALSE</span><span class="plain">;</span>
<span class="identifier">p</span><span class="plain"> += </span><span class="identifier">Wide::len</span><span class="plain">(</span><span class="identifier">p</span><span class="plain">) - 2;</span>
<span class="reserved">if</span><span class="plain"> ((</span><span class="identifier">p</span><span class="plain">[0] == </span><span class="character">'.'</span><span class="plain">) && (</span><span class="identifier">p</span><span class="plain">[1] == </span><span class="character">'"'</span><span class="plain">)) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">TRUE</span><span class="plain">;</span>
<span class="reserved">if</span><span class="plain"> ((</span><span class="identifier">p</span><span class="plain">[0] == </span><span class="character">'?'</span><span class="plain">) && (</span><span class="identifier">p</span><span class="plain">[1] == </span><span class="character">'"'</span><span class="plain">)) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">TRUE</span><span class="plain">;</span>
<span class="reserved">if</span><span class="plain"> ((</span><span class="identifier">p</span><span class="plain">[0] == </span><span class="character">'!'</span><span class="plain">) && (</span><span class="identifier">p</span><span class="plain">[1] == </span><span class="character">'"'</span><span class="plain">)) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">TRUE</span><span class="plain">;</span>
<span class="identifier">p</span><span class="plain">--;</span>
<span class="reserved">if</span><span class="plain"> ((</span><span class="identifier">p</span><span class="plain">[0] == </span><span class="character">'.'</span><span class="plain">) && (</span><span class="identifier">p</span><span class="plain">[1] == </span><span class="character">')'</span><span class="plain">) && (</span><span class="identifier">p</span><span class="plain">[2] == </span><span class="character">'"'</span><span class="plain">)) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">TRUE</span><span class="plain">;</span>
<span class="reserved">if</span><span class="plain"> ((</span><span class="identifier">p</span><span class="plain">[0] == </span><span class="character">'?'</span><span class="plain">) && (</span><span class="identifier">p</span><span class="plain">[1] == </span><span class="character">')'</span><span class="plain">) && (</span><span class="identifier">p</span><span class="plain">[2] == </span><span class="character">'"'</span><span class="plain">)) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">TRUE</span><span class="plain">;</span>
<span class="reserved">if</span><span class="plain"> ((</span><span class="identifier">p</span><span class="plain">[0] == </span><span class="character">'!'</span><span class="plain">) && (</span><span class="identifier">p</span><span class="plain">[1] == </span><span class="character">')'</span><span class="plain">) && (</span><span class="identifier">p</span><span class="plain">[2] == </span><span class="character">'"'</span><span class="plain">)) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">TRUE</span><span class="plain">;</span>
<span class="reserved">if</span><span class="plain"> ((</span><span class="identifier">p</span><span class="plain">[0] == </span><span class="character">'.'</span><span class="plain">) && (</span><span class="identifier">p</span><span class="plain">[1] == </span><span class="character">'\</span><span class="plain">'</span><span class="character">'</span><span class="plain">) && (</span><span class="identifier">p</span><span class="plain">[2] == </span><span class="character">'"'</span><span class="plain">)) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">TRUE</span><span class="plain">;</span>
<span class="reserved">if</span><span class="plain"> ((</span><span class="identifier">p</span><span class="plain">[0] == </span><span class="character">'?'</span><span class="plain">) && (</span><span class="identifier">p</span><span class="plain">[1] == </span><span class="character">'\</span><span class="plain">'</span><span class="character">'</span><span class="plain">) && (</span><span class="identifier">p</span><span class="plain">[2] == </span><span class="character">'"'</span><span class="plain">)) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">TRUE</span><span class="plain">;</span>
<span class="reserved">if</span><span class="plain"> ((</span><span class="identifier">p</span><span class="plain">[0] == </span><span class="character">'!'</span><span class="plain">) && (</span><span class="identifier">p</span><span class="plain">[1] == </span><span class="character">'\</span><span class="plain">'</span><span class="character">'</span><span class="plain">) && (</span><span class="identifier">p</span><span class="plain">[2] == </span><span class="character">'"'</span><span class="plain">)) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">TRUE</span><span class="plain">;</span>
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">FALSE</span><span class="plain">;</span>
<span class="plain">}</span>
</pre>
<p class="inwebparagraph"></p>
<p class="endnote">The function Word::text_ending_sentence is used in <a href="#SP5">§5</a>.</p>
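<p class="inwebparagraph">The nine cases tested above collapse into one pattern: sentence-ending
punctuation, then at most one closing bracket or single quote, then the closing
double quote. The following is a compact standalone sketch of that pattern on
ordinary <code class="display"><span class="extract">char</span></code> strings. The name <code class="display"><span class="extract">ends_sentence</span></code> is hypothetical, and the explicit
length check is an addition of this sketch, made so that very short texts are
never read before their first character.
</p>

```c
#include <string.h>

/* Does the double-quoted text p end with . ? or ! just before its closing
   quote, allowing one ')' or single quote in between? Returns 1 or 0. */
int ends_sentence(const char *p) {
    size_t len = strlen(p);
    if (len < 3 || p[0] != '"' || p[len-1] != '"') return 0;
    size_t k = len - 2;              /* last character inside the quotes */
    if ((p[k] == ')' || p[k] == '\'') && k > 1) k--;   /* skip one closer */
    return (p[k] == '.' || p[k] == '?' || p[k] == '!');
}
```

<p class="inwebparagraph">So <code class="display"><span class="extract">"Hello."</span></code>, <code class="display"><span class="extract">"(Go on!)"</span></code> and <code class="display"><span class="extract">"He said 'no.'"</span></code> all qualify, while <code class="display"><span class="extract">"Hello"</span></code>
and the empty text <code class="display"><span class="extract">""</span></code> do not.
</p>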
<p class="inwebparagraph"><a id="SP8"></a><b>§8. Dequoting literal text. </b>A utility for stripping the double quotes from literal text, along with
any leading or trailing spaces inside those quotes. A word whose text does not
begin with a double quote is left alone.
</p>

<pre class="display">
<span class="reserved">void</span><span class="plain"> </span><span class="functiontext">Word::dequote</span><span class="plain">(</span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">wn</span><span class="plain">) {</span>
<span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">previous_text</span><span class="plain"> = </span><span class="functiontext">Lexer::word_text</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">);</span>
<span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">dequoted_text</span><span class="plain">;</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">previous_text</span><span class="plain">[0] != </span><span class="character">'"'</span><span class="plain">) </span><span class="reserved">return</span><span class="plain">;</span>
<span class="functiontext">Lexer::set_word_raw_text</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">, </span><span class="functiontext">Lexer::copy_to_memory</span><span class="plain">(</span><span class="functiontext">Lexer::word_raw_text</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">)));</span>
<span class="identifier">dequoted_text</span><span class="plain"> = </span><span class="identifier">previous_text</span><span class="plain"> + 1;</span>
<span class="reserved">while</span><span class="plain"> (*(</span><span class="identifier">dequoted_text</span><span class="plain">) == </span><span class="character">' '</span><span class="plain">) </span><span class="identifier">dequoted_text</span><span class="plain">++;</span>
<span class="reserved">if</span><span class="plain"> ((</span><span class="identifier">Wide::len</span><span class="plain">(</span><span class="identifier">dequoted_text</span><span class="plain">) > 0) &&</span>
<span class="plain">(*(</span><span class="identifier">dequoted_text</span><span class="plain">+</span><span class="identifier">Wide::len</span><span class="plain">(</span><span class="identifier">dequoted_text</span><span class="plain">)-1) == </span><span class="character">'"'</span><span class="plain">))</span>
<span class="plain">*(</span><span class="identifier">dequoted_text</span><span class="plain">+</span><span class="identifier">Wide::len</span><span class="plain">(</span><span class="identifier">dequoted_text</span><span class="plain">)-1) = 0;</span>
<span class="reserved">while</span><span class="plain"> ((</span><span class="identifier">Wide::len</span><span class="plain">(</span><span class="identifier">dequoted_text</span><span class="plain">) > 0) &&</span>
<span class="plain">(*(</span><span class="identifier">dequoted_text</span><span class="plain">+</span><span class="identifier">Wide::len</span><span class="plain">(</span><span class="identifier">dequoted_text</span><span class="plain">)-1) == </span><span class="character">' '</span><span class="plain">))</span>
<span class="plain">*(</span><span class="identifier">dequoted_text</span><span class="plain">+</span><span class="identifier">Wide::len</span><span class="plain">(</span><span class="identifier">dequoted_text</span><span class="plain">)-1) = 0;</span>
<span class="functiontext">Lexer::set_word_text</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">, </span><span class="identifier">dequoted_text</span><span class="plain">);</span>
<span class="identifier">LOGIF</span><span class="plain">(</span><span class="identifier">VOCABULARY</span><span class="plain">, </span><span class="string">"Dequoting word %d &lt;%w&gt; to &lt;%w&gt;\</span><span class="plain">n</span><span class="string">"</span><span class="plain">,</span>
<span class="identifier">wn</span><span class="plain">, </span><span class="identifier">previous_text</span><span class="plain">, </span><span class="identifier">dequoted_text</span><span class="plain">);</span>
<span class="functiontext">Vocabulary::identify_word</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">);</span>
<span class="functiontext">Vocabulary::set_raw_exemplar_to_text</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">);</span>
<span class="plain">}</span>
</pre>
<p class="inwebparagraph"></p>
<p class="endnote">The function Word::dequote appears nowhere else.</p>
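<p class="inwebparagraph">Stripped of the lexer bookkeeping, the text manipulation in the routine above
amounts to the following. This is a standalone sketch on an ordinary writable
<code class="display"><span class="extract">char</span></code> buffer, and the name <code class="display"><span class="extract">dequote</span></code> is hypothetical. Like the original, it
works in place: it returns a pointer into the same buffer and overwrites
trailing characters with terminators.
</p>

```c
#include <string.h>

/* Strip the enclosing double quotes, and any spaces just inside them,
   from buf in place. Returns a pointer into buf. Text which does not
   begin with a double quote is returned unchanged. */
char *dequote(char *buf) {
    if (buf[0] != '"') return buf;
    char *t = buf + 1;                          /* skip the opening quote */
    while (*t == ' ') t++;                      /* and leading spaces */
    size_t n = strlen(t);
    if (n > 0 && t[n-1] == '"') t[--n] = 0;     /* drop the closing quote */
    while (n > 0 && t[n-1] == ' ') t[--n] = 0;  /* and trailing spaces */
    return t;
}
```

<p class="inwebparagraph">Because the buffer is modified, the argument must not be a string literal.
</p>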
<p class="inwebparagraph"><a id="SP9"></a><b>§9. Dictionary words. </b>We take a wide Unicode string and compile an I6 dictionary word constant
which lodges the same text in the virtual machine's parsing dictionary.
</p>

<p class="inwebparagraph">A legal I6 dictionary word can take several forms: it can be in single
quotes, <code class="display"><span class="extract">'thus'</span></code>, but only if it is more than one character long, since
<code class="display"><span class="extract">'t'</span></code> would be the character value of lower-case T instead. (Or it can be
double-quoted <code class="display"><span class="extract">"so"</span></code>, but only in grammar or properties; this usage is
deprecated and we avoid it.) Within the dictionary word, <code class="display"><span class="extract">^</span></code> is an escape
character meaning a literal single quote, and the notation <code class="display"><span class="extract">@{xx}</span></code> is an
escape meaning the character with hexadecimal value <code class="display"><span class="extract">xx</span></code>.
</p>

<p class="inwebparagraph">Optionally, a dictionary word can end with a pair of slashes and then,
optionally again, markers to indicate that the word is (for instance) a
plural: thus <code class="display"><span class="extract">'newts//p'</span></code>. Using no markers, as in <code class="display"><span class="extract">'toads//'</span></code>, makes a
word equivalent to the same word written without the slashes, but avoids the
single-letter problem. The preferred modern way to write a single-character I6
dictionary word is therefore <code class="display"><span class="extract">'t//'</span></code>, and this is what the following routine does.
(Note the exceptional case where the word consists only of a <code class="display"><span class="extract">'/'</span></code>: here
we cannot write <code class="display"><span class="extract">'///'</span></code> because I6 reads this as <code class="display"><span class="extract">//</span></code> plus an invalid
marker <code class="display"><span class="extract">/</span></code>, and throws an error. We escape the single <code class="display"><span class="extract">/</span></code> as <code class="display"><span class="extract">@{2F}</span></code> to avoid this.
In all other cases there's no need to escape a <code class="display"><span class="extract">/</span></code>.)
</p>

<p class="inwebparagraph">Dictionary words containing a literal <code class="display"><span class="extract">~</span></code> are, as it happens, not parsable
by the Z-machine, but the code below, which employs the <code class="display"><span class="extract">@{7E}</span></code>
escape, is in principle legal, and it does work on Glulx.
</p>

<p class="inwebparagraph">Very long words can safely be truncated, since the virtual machines do not
have indefinitely long dictionary resolution anyway, and we had better do
so because I6 rejects overlong text between single quotation marks. (The
routine below stops after 34 characters of the original word.)
</p>

<pre class="display">
<span class="reserved">void</span><span class="plain"> </span><span class="functiontext">Word::compile_to_I6_dictionary</span><span class="plain">(</span><span class="identifier">OUTPUT_STREAM</span><span class="plain">, </span><span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">p</span><span class="plain">, </span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">pluralise</span><span class="plain">) {</span>
    <span class="reserved">int</span><span class="plain"> </span><span class="identifier">c</span><span class="plain">, </span><span class="identifier">n</span><span class="plain"> = 0;</span>
    <span class="identifier">WRITE</span><span class="plain">(</span><span class="string">"'"</span><span class="plain">);</span>
    <span class="reserved">for</span><span class="plain"> (</span><span class="identifier">c</span><span class="plain">=0; </span><span class="identifier">p</span><span class="plain">[</span><span class="identifier">c</span><span class="plain">] != 0; </span><span class="identifier">c</span><span class="plain">++) {</span>
        <span class="reserved">switch</span><span class="plain">(</span><span class="identifier">p</span><span class="plain">[</span><span class="identifier">c</span><span class="plain">]) {</span>
            <span class="reserved">case</span><span class="plain"> </span><span class="character">'/'</span><span class="plain">: </span><span class="reserved">if</span><span class="plain"> (</span><span class="identifier">p</span><span class="plain">[1] == 0) </span><span class="identifier">WRITE</span><span class="plain">(</span><span class="string">"@{2F}"</span><span class="plain">); </span><span class="reserved">else</span><span class="plain"> </span><span class="identifier">WRITE</span><span class="plain">(</span><span class="string">"/"</span><span class="plain">); </span><span class="reserved">break</span><span class="plain">;</span>
            <span class="reserved">case</span><span class="plain"> </span><span class="character">'\</span><span class="plain">'</span><span class="character">'</span><span class="plain">: </span><span class="identifier">WRITE</span><span class="plain">(</span><span class="string">"^"</span><span class="plain">); </span><span class="reserved">break</span><span class="plain">;</span>
            <span class="reserved">case</span><span class="plain"> </span><span class="character">'^'</span><span class="plain">: </span><span class="identifier">WRITE</span><span class="plain">(</span><span class="string">"@{5E}"</span><span class="plain">); </span><span class="reserved">break</span><span class="plain">;</span>
            <span class="reserved">case</span><span class="plain"> </span><span class="character">'~'</span><span class="plain">: </span><span class="identifier">WRITE</span><span class="plain">(</span><span class="string">"@{7E}"</span><span class="plain">); </span><span class="reserved">break</span><span class="plain">;</span>
            <span class="reserved">case</span><span class="plain"> </span><span class="character">'@'</span><span class="plain">: </span><span class="identifier">WRITE</span><span class="plain">(</span><span class="string">"@{40}"</span><span class="plain">); </span><span class="reserved">break</span><span class="plain">;</span>
            <span class="reserved">default</span><span class="plain">: </span><span class="identifier">PUT</span><span class="plain">(</span><span class="identifier">p</span><span class="plain">[</span><span class="identifier">c</span><span class="plain">]);</span>
        <span class="plain">}</span>
        <span class="reserved">if</span><span class="plain"> (</span><span class="identifier">n</span><span class="plain">++ > 32) </span><span class="reserved">break</span><span class="plain">;</span>
    <span class="plain">}</span>
    <span class="reserved">if</span><span class="plain"> (</span><span class="identifier">pluralise</span><span class="plain">) </span><span class="identifier">WRITE</span><span class="plain">(</span><span class="string">"//p"</span><span class="plain">);</span>
    <span class="reserved">else</span><span class="plain"> </span><span class="reserved">if</span><span class="plain"> (</span><span class="identifier">Wide::len</span><span class="plain">(</span><span class="identifier">p</span><span class="plain">) == 1) </span><span class="identifier">WRITE</span><span class="plain">(</span><span class="string">"//"</span><span class="plain">);</span>
    <span class="identifier">WRITE</span><span class="plain">(</span><span class="string">"'"</span><span class="plain">);</span>
<span class="plain">}</span>
</pre>

<p class="inwebparagraph"></p>

<p class="endnote">The function Word::compile_to_I6_dictionary is used in 3/lxr (<a href="3-lxr.html#SP18">§18</a>).</p>
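
<p class="inwebparagraph">The escaping rules described in this section can be tried outside Inform with
a stand-alone sketch. This is not part of the words module: the names
<code class="display"><span class="extract">append</span></code> and <code class="display"><span class="extract">sketch_dictionary_word</span></code> are hypothetical, the output goes to a
plain character buffer rather than an <code class="display"><span class="extract">OUTPUT_STREAM</span></code>, and the input is assumed
to be ASCII only, so it illustrates the logic rather than reproducing the real routine.
</p>

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: append s to out (capacity cap), dropping text that
   would not fit. */
static void append(char *out, size_t cap, const char *s) {
    size_t used = strlen(out);
    if (used + strlen(s) < cap) strcat(out, s);
}

/* Hypothetical analogue of Word::compile_to_I6_dictionary, writing to a
   buffer. Assumes every character of p is ASCII. */
static void sketch_dictionary_word(char *out, size_t cap,
                                   const wchar_t *p, int pluralise) {
    int c, n = 0;
    assert(out != NULL && cap > 0);
    out[0] = '\0';
    append(out, cap, "'");
    for (c = 0; p[c] != 0; c++) {
        switch (p[c]) {
            /* A lone slash must be escaped; elsewhere it is written as-is. */
            case L'/':  append(out, cap, (p[1] == 0) ? "@{2F}" : "/"); break;
            case L'\'': append(out, cap, "^"); break;
            case L'^':  append(out, cap, "@{5E}"); break;
            case L'~':  append(out, cap, "@{7E}"); break;
            case L'@':  append(out, cap, "@{40}"); break;
            default: {
                char one[2] = { (char) p[c], '\0' };  /* ASCII assumption */
                append(out, cap, one);
                break;
            }
        }
        if (n++ > 32) break;  /* stops once 34 characters have been handled */
    }
    if (pluralise) append(out, cap, "//p");
    else if (wcslen(p) == 1) append(out, cap, "//");
    append(out, cap, "'");
}
```

<p class="inwebparagraph">Under these assumptions, the word <code class="display"><span class="extract">t</span></code> comes out as <code class="display"><span class="extract">'t//'</span></code>, the word <code class="display"><span class="extract">newts</span></code>
with the plural flag set comes out as <code class="display"><span class="extract">'newts//p'</span></code>, and a lone <code class="display"><span class="extract">/</span></code> comes out as
<code class="display"><span class="extract">'@{2F}//'</span></code>.
</p>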

<hr class="tocbar">
<ul class="toc"><li><i>(This section begins Chapter 4: Parsing.)</i></li><li><a href="4-prf.html">Continue with 'Preform'</a></li></ul><hr class="tocbar">
<!--End of weave-->
</body>
</html>