mirror of https://github.com/ganelson/inform.git synced 2024-07-08 01:54:21 +03:00
inform7/docs/basic_inform/S-tt2.html

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>S/tt</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Language" content="en-gb">
<link href="inweb.css" rel="stylesheet" rev="stylesheet" type="text/css">
</head>
<body>
<!--Weave of 'S/tt2' generated by 7-->
<ul class="crumbs"><li><a href="../webs.html">&#9733;</a></li><li><a href="index.html">basic_inform Template Library</a></li><li><b>Text Template</b></li></ul><p class="purpose">Code to support the text kind of value.</p>
<ul class="toc"><li><a href="#SP1">&#167;1. Block Format</a></li><li><a href="#SP2">&#167;2. Extent Of Long Block</a></li><li><a href="#SP3">&#167;3. Character Set</a></li><li><a href="#SP4">&#167;4. KOV Support</a></li><li><a href="#SP5">&#167;5. Debugging</a></li><li><a href="#SP6">&#167;6. Creation</a></li><li><a href="#SP7">&#167;7. Copy Short Block</a></li><li><a href="#SP8">&#167;8. Transmutation</a></li><li><a href="#SP9">&#167;9. Mutability</a></li><li><a href="#SP10">&#167;10. Casting</a></li><li><a href="#SP11">&#167;11. Data Conversion</a></li><li><a href="#SP12">&#167;12. Z Version</a></li><li><a href="#SP13">&#167;13. Glulx Version</a></li><li><a href="#SP14">&#167;14. Comparison</a></li><li><a href="#SP15">&#167;15. Hashing</a></li><li><a href="#SP16">&#167;16. Printing</a></li><li><a href="#SP17">&#167;17. Capitalised printing</a></li><li><a href="#SP18">&#167;18. Serialisation</a></li><li><a href="#SP19">&#167;19. Unserialisation</a></li><li><a href="#SP20">&#167;20. Substitution</a></li><li><a href="#SP21">&#167;21. Perishability</a></li><li><a href="#SP22">&#167;22. Blobs</a></li><li><a href="#SP23">&#167;23. Blob Access</a></li><li><a href="#SP24">&#167;24. Get Blob</a></li><li><a href="#SP25">&#167;25. Replace Blob</a></li><li><a href="#SP26">&#167;26. Replace Text</a></li><li><a href="#SP27">&#167;27. Character Length</a></li><li><a href="#SP28">&#167;28. Get Character</a></li><li><a href="#SP29">&#167;29. Casing</a></li><li><a href="#SP30">&#167;30. Change Case</a></li><li><a href="#SP31">&#167;31. Concatenation</a></li></ul><hr class="tocbar">
<p class="inwebparagraph"><a id="SP1"></a><b>&#167;1. Block Format. </b>The short block for a text is two words long: the first word selects which
form of storage will be used to represent the content, and the second word
is a reference to that content. This reference is an I6 String or Routine
in all cases except one, when it's a pointer to a long block containing
a null-terminated array of characters, like a C string.
</p>
<p class="inwebparagraph">Clearly we need <code class="display"><span class="extract">PACKED_TEXT_STORAGE</span></code> and <code class="display"><span class="extract">UNPACKED_TEXT_STORAGE</span></code> to
distinguish between the two basic methods of text storage, roughly
equivalent to the pre-2013 kinds "text" and "indexed text". But why
do we need four?
</p>
<p class="inwebparagraph"><code class="display"><span class="extract">CONSTANT_PACKED_TEXT_STORAGE</span></code> is easy to explain: the BlkValue routines
normally detect constants using metadata in their long blocks, but of
course that won't work for values which haven't got any long blocks.
We use this instead. We don't need a <code class="display"><span class="extract">CONSTANT_UNPACKED_TEXT_STORAGE</span></code>
because I7 never compiles constant text in unpacked form.
</p>
<p class="inwebparagraph">The surprising one is <code class="display"><span class="extract">CONSTANT_PERISHABLE_TEXT_STORAGE</span></code>. This is a
constant created by the I7 compiler which is marked as being tricky
because its value is a text substitution containing references to local
variables. Unlike other text substitutions, this can't meaningfully be
stored away to be expanded later: it must be expanded into unpacked
text before it perishes.
</p>
<pre class="display">
<span class="plain">Constant CONSTANT_PACKED_TEXT_STORAGE = BLK_BVBITMAP_TEXT + BLK_BVBITMAP_CONSTANT + 1;</span>
<span class="plain">Constant CONSTANT_PERISHABLE_TEXT_STORAGE = BLK_BVBITMAP_TEXT + BLK_BVBITMAP_CONSTANT + 2;</span>
<span class="plain">Constant PACKED_TEXT_STORAGE = BLK_BVBITMAP_TEXT + 3;</span>
<span class="plain">Constant UNPACKED_TEXT_STORAGE = BLK_BVBITMAP_TEXT + BLK_BVBITMAP_LONGBLOCK + 4;</span>
</pre>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP2"></a><b>&#167;2. Extent Of Long Block. </b>When there's a long block, we need enough entries to store the
characters, plus one for the null terminator.
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_Extent arg1 x;</span>
<span class="plain">    x = BlkValueSeekZeroEntry(arg1);</span>
<span class="plain">    if (x &lt; 0) return -1; ! should not happen, of course</span>
<span class="plain">    return x+1;</span>
<span class="plain">];</span>
</pre>
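<p class="inwebparagraph">As a rough illustration of this rule, here is a small Python model (not part of the template). The name <code class="display"><span class="extract">text_extent</span></code> and the representation of a long block as a list of integers are assumptions made for the sketch.
</p>

```python
def text_extent(entries):
    """Return the number of entries needed to hold the characters plus
    the null terminator, or -1 if no terminator is present."""
    for index, value in enumerate(entries):
        if value == 0:
            # index characters precede the terminator, plus one for it
            return index + 1
    return -1  # mirrors the "should not happen" branch
```

<p class="inwebparagraph">For a block holding "ABC" followed by spare zero entries, the extent is 4: three characters and the terminator.
</p>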
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP3"></a><b>&#167;3. Character Set. </b>On the Z-machine, we use the 8-bit ZSCII character set, stored in bytes;
on Glulx, we use the opening 16-bit subset of Unicode (which, though only a
subset, covers almost all letter forms used on Earth), stored in half-words.
</p>
<p class="inwebparagraph">The Z-machine does have very partial Unicode support, but not in a way that
can help us here. It is capable of printing a wide range of Unicode
characters, and on a good interpreter with a good font (such as Zoom for Mac
OS X, using the Lucida Grande font) can produce many thousands of glyphs. But
it is not capable of printing those characters into memory rather than the
screen, an essential technique for texts: it can only write each character to
a single byte, and it does so in ZSCII. That forces our hand when it comes to
choosing the character set for unpacked text.
</p>
<pre class="display">
<span class="plain">#IFDEF TARGET_ZCODE;</span>
<span class="plain">Constant TEXT_TY_Storage_Flags = BLK_FLAG_MULTIPLE;</span>
<span class="plain">Constant ZSCII_Tables;</span>
<span class="plain">#IFNOT;</span>
<span class="plain">Constant TEXT_TY_Storage_Flags = BLK_FLAG_MULTIPLE + BLK_FLAG_16_BIT;</span>
<span class="plain">Constant Large_Unicode_Tables;</span>
<span class="plain">#ENDIF;</span>
<span class="plain">{-segment:UnicodeData.i6t}</span>
<span class="plain">{-segment:Char.i6t}</span>
</pre>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP4"></a><b>&#167;4. KOV Support. </b>See the "BlockValues.i6t" segment for the specification of the following
routines. Because no block values are ever stored in a text, they can
freely be bitwise copied or forgotten, which is why we need do nothing
special to copy or destroy a text.
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_Support task arg1 arg2 arg3;</span>
<span class="plain">    switch(task) {</span>
<span class="plain">        CREATE_KOVS: return TEXT_TY_Create(arg2);</span>
<span class="plain">        CAST_KOVS: TEXT_TY_Cast(arg1, arg2, arg3);</span>
<span class="plain">        MAKEMUTABLE_KOVS: return TEXT_TY_Mutable(arg1);</span>
<span class="plain">        COPYQUICK_KOVS: rtrue;</span>
<span class="plain">        COPYSB_KOVS: TEXT_TY_CopySB(arg1, arg2);</span>
<span class="plain">        KINDDATA_KOVS: return 0;</span>
<span class="plain">        EXTENT_KOVS: return TEXT_TY_Extent(arg1);</span>
<span class="plain">        COMPARE_KOVS: return TEXT_TY_Compare(arg1, arg2);</span>
<span class="plain">        READ_FILE_KOVS: if (arg3 == -1) rtrue;</span>
<span class="plain">            return TEXT_TY_ReadFile(arg1, arg2, arg3);</span>
<span class="plain">        WRITE_FILE_KOVS: return TEXT_TY_WriteFile(arg1);</span>
<span class="plain">        HASH_KOVS: return TEXT_TY_Hash(arg1);</span>
<span class="plain">        DEBUG_KOVS: TEXT_TY_Debug(arg1);</span>
<span class="plain">    }</span>
<span class="plain">    ! We choose not to respond to: DESTROY_KOVS, COPYKIND_KOVS, COPY_KOVS</span>
<span class="plain">    rfalse;</span>
<span class="plain">];</span>
</pre>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP5"></a><b>&#167;5. Debugging. </b>This shows the various forms a text's short block can take:
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_Debug txt;</span>
<span class="plain">    switch (txt--&gt;0) {</span>
<span class="plain">        CONSTANT_PACKED_TEXT_STORAGE: print " = cp~", (PrintI6Text) txt--&gt;1, "~";</span>
<span class="plain">        CONSTANT_PERISHABLE_TEXT_STORAGE: print " = cp~", (PrintI6Text) txt--&gt;1, "~";</span>
<span class="plain">        PACKED_TEXT_STORAGE: print " = p~", (PrintI6Text) txt--&gt;1, "~";</span>
<span class="plain">        UNPACKED_TEXT_STORAGE: print " = ~", (TEXT_TY_Say) txt, "~";</span>
<span class="plain">        default: print " broken?";</span>
<span class="plain">    }</span>
<span class="plain">];</span>
</pre>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP6"></a><b>&#167;6. Creation. </b>A newly created text is a two-word short block with no long block, like this:
</p>
<p class="inwebparagraph"></p>
<pre class="display">
<span class="plain">Array ThisIsAText --&gt; PACKED_TEXT_STORAGE EMPTY_TEXT_PACKED;</span>
<span class="plain">[ TEXT_TY_Create short_block x;</span>
<span class="plain">    return BlkValueCreateSB2(short_block, PACKED_TEXT_STORAGE, EMPTY_TEXT_PACKED);</span>
<span class="plain">];</span>
</pre>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP7"></a><b>&#167;7. Copy Short Block. </b>When a short block for a constant is copied, the new copy isn't a constant
any more.
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_CopySB to_bv from_bv;</span>
<span class="plain">    BlkValueCopySB2(to_bv, from_bv);</span>
<span class="plain">    if (to_bv--&gt;0 &amp; BLK_BVBITMAP_CONSTANTMASK) to_bv--&gt;0 = PACKED_TEXT_STORAGE;</span>
<span class="plain">];</span>
</pre>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP8"></a><b>&#167;8. Transmutation. </b>What happens if a text is stored in packed form, but we need to access or
change its individual characters? The answer is that we have to "transmute"
it into long block form. Sometimes this is a permanent change, but often
it's only temporary, and will soon be followed by an un-transmutation.
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_Transmute txt;</span>
<span class="plain">    TEXT_TY_Temporarily_Transmute(txt);</span>
<span class="plain">];</span>
<span class="plain">[ TEXT_TY_Temporarily_Transmute txt x;</span>
<span class="plain">    if ((txt) &amp;&amp; (txt--&gt;0 &amp; BLK_BVBITMAP_LONGBLOCKMASK == 0)) {</span>
<span class="plain">        x = txt--&gt;1; ! The old value was a packed string</span>
<span class="plain">        txt--&gt;0 = UNPACKED_TEXT_STORAGE;</span>
<span class="plain">        txt--&gt;1 = FlexAllocate(32, TEXT_TY, TEXT_TY_Storage_Flags);</span>
<span class="plain">        if (x ~= EMPTY_TEXT_PACKED) TEXT_TY_CastPrimitive(txt, false, x);</span>
<span class="plain">        return x;</span>
<span class="plain">    }</span>
<span class="plain">    return 0;</span>
<span class="plain">];</span>
<span class="plain">[ TEXT_TY_Untransmute txt pk cp x;</span>
<span class="plain">    if ((pk) &amp;&amp; (txt--&gt;0 == UNPACKED_TEXT_STORAGE)) {</span>
<span class="plain">        x = txt--&gt;1; ! The old value was an unpacked string</span>
<span class="plain">        FlexFree(x);</span>
<span class="plain">        txt--&gt;0 = cp;</span>
<span class="plain">        txt--&gt;1 = pk; ! The value earlier returned by TEXT_TY_Temporarily_Transmute</span>
<span class="plain">    }</span>
<span class="plain">    return txt;</span>
<span class="plain">];</span>
</pre>
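<p class="inwebparagraph">The save-and-restore pattern used by temporary transmutation can be sketched in Python as follows. This is a simplified model, not the template's code: the short block is assumed to be a two-element list, a packed value is stood in for by a Python string, and the names <code class="display"><span class="extract">temporarily_transmute</span></code> and <code class="display"><span class="extract">untransmute</span></code> are hypothetical.
</p>

```python
PACKED, UNPACKED = "packed", "unpacked"


def temporarily_transmute(block):
    """If the block is packed, expand it into a null-terminated list of
    character codes and return the old packed value; otherwise return
    None to signal that nothing was changed."""
    storage, content = block
    if storage == UNPACKED:
        return None
    block[0] = UNPACKED
    block[1] = [ord(ch) for ch in content] + [0]
    return content


def untransmute(block, packed, storage):
    """Undo a temporary transmutation, given the packed value and the
    storage word that the caller saved beforehand."""
    if packed is not None and block[0] == UNPACKED:
        block[0] = storage
        block[1] = packed
    return block
```

<p class="inwebparagraph">As in the routines above, the caller must remember the original storage word before transmuting, since the return value carries only the old content.
</p>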
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP9"></a><b>&#167;9. Mutability. </b>That neatly handles the question of how to make a text mutable. (Note that
constants are never created in unpacked form.)
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_Mutable txt;</span>
<span class="plain">    if (txt--&gt;0 &amp; BLK_BVBITMAP_LONGBLOCKMASK == 0) {</span>
<span class="plain">        TEXT_TY_Transmute(txt);</span>
<span class="plain">        return 0;</span>
<span class="plain">    }</span>
<span class="plain">    return 2; ! Tell BlockValue there's a long block pointer</span>
<span class="plain">];</span>
</pre>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP10"></a><b>&#167;10. Casting. </b>In general computing, "casting" is the process of translating data in one
type into semantically equivalent data in another: the only interesting
cast here is that a snippet can be turned into a text.
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_Cast to_txt from_kind from_value;</span>
<span class="plain">    if (from_kind == TEXT_TY) {</span>
<span class="plain">        BlkValueCopy(to_txt, from_value);</span>
<span class="plain">    } else if (from_kind == SNIPPET_TY) {</span>
<span class="plain">        TEXT_TY_Transmute(to_txt);</span>
<span class="plain">        TEXT_TY_CastPrimitive(to_txt, true, from_value);</span>
<span class="plain">    } else BlkValueError("impossible cast to text");</span>
<span class="plain">];</span>
<span class="plain">[ SNIPPET_TY_to_TEXT_TY to_txt snippet;</span>
<span class="plain">    return BlkValueCast(to_txt, SNIPPET_TY, snippet);</span>
<span class="plain">];</span>
</pre>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP11"></a><b>&#167;11. Data Conversion. </b>We use a single routine to handle two kinds of format translation: a
packed I6 string into an unpacked text, or a snippet into an unpacked text.
</p>
<p class="inwebparagraph">In each case, what we do is simply to print out the value we have, but with
the output stream set to memory rather than the screen. That gives us the
character by character version, neatly laid out in an array, and all we have
to do is to copy it into the text and add a null terminator.
</p>
<p class="inwebparagraph">What complicates things is that the two virtual machines handle printing
to memory quite differently, and that the original text has unpredictable
length. We are going to try printing it into the array <code class="display"><span class="extract">TEXT_TY_Buffers</span></code>,
but what if the text is too big? Disastrously, the Z-machine simply
writes on in memory, corrupting all subsequent arrays and almost certainly
causing the story file to crash soon after. There is nothing we can do
to predict or avoid this, or to repair the damage: this is why the Inform
documentation warns users to be wary of handling large texts
on the Z-machine, and advises the use of Glulx instead. Glulx
does handle overruns safely, and indeed allows us to dynamically allocate
memory as necessary so that we can always avoid overruns entirely.
</p>
<pre class="display">
<span class="plain">Constant TEXT_TY_NoBuffers = 2;</span>
<span class="plain">#ifdef TARGET_ZCODE;</span>
<span class="plain">Array TEXT_TY_Buffers -&gt; TEXT_TY_BufferSize*TEXT_TY_NoBuffers; ! Where characters are bytes</span>
<span class="plain">#ifnot;</span>
<span class="plain">Array TEXT_TY_Buffers --&gt; (TEXT_TY_BufferSize+2)*TEXT_TY_NoBuffers; ! Where characters are words</span>
<span class="plain">#endif;</span>
<span class="plain">Global RawBufferAddress = TEXT_TY_Buffers;</span>
<span class="plain">Global RawBufferSize = TEXT_TY_BufferSize;</span>
<span class="plain">Global TEXT_TY_CastPrimitiveNesting = 0;</span>
</pre>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP12"></a><b>&#167;12. Z Version. </b>The two versions of this routine, one for each virtual machine, are in all
important respects the same, but there are enough fiddly differences that
it's clearer to give two definitions, so:
</p>
<pre class="display">
<span class="plain">#ifdef TARGET_ZCODE;</span>
<span class="plain">[ TEXT_TY_CastPrimitive to_txt from_snippet from_value len news buffer;</span>
<span class="plain">    if (to_txt == 0) BlkValueError("no destination for cast");</span>
<span class="plain">    SuspendRTP();</span>
<span class="plain">    buffer = RawBufferAddress + TEXT_TY_CastPrimitiveNesting*TEXT_TY_BufferSize;</span>
<span class="plain">    TEXT_TY_CastPrimitiveNesting++;</span>
<span class="plain">    if (TEXT_TY_CastPrimitiveNesting &gt; TEXT_TY_NoBuffers)</span>
<span class="plain">        FlexError("ran out with too many simultaneous text conversions");</span>
<span class="plain">    @push say__p; @push say__pc;</span>
<span class="plain">    ClearParagraphing(6);</span>
<span class="plain">    @output_stream 3 buffer;</span>
<span class="plain">    if (from_value) {</span>
<span class="plain">        if (from_snippet) print (PrintSnippet) from_value;</span>
<span class="plain">        else print (PrintI6Text) from_value;</span>
<span class="plain">    }</span>
<span class="plain">    @output_stream -3;</span>
<span class="plain">    @pull say__pc; @pull say__p;</span>
<span class="plain">    ResumeRTP();</span>
<span class="plain">    len = buffer--&gt;0;</span>
<span class="plain">    if (len &gt; RawBufferSize-1) len = RawBufferSize-1;</span>
<span class="plain">    buffer-&gt;(len+2) = 0;</span>
<span class="plain">    TEXT_TY_CastPrimitiveNesting--;</span>
<span class="plain">    BlkValueMassCopyFromArray(to_txt, buffer+2, 1, len+1);</span>
<span class="plain">];</span>
</pre>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP13"></a><b>&#167;13. Glulx Version. </b></p>
<pre class="display">
<span class="plain">#ifnot; ! TARGET_ZCODE</span>
<span class="plain">[ TEXT_TY_CastPrimitive to_txt from_snippet from_value</span>
<span class="plain">    len i stream saved_stream news buffer buffer_size memory_to_free results;</span>
<span class="plain">    if (to_txt == 0) BlkValueError("no destination for cast");</span>
<span class="plain">    buffer_size = (TEXT_TY_BufferSize + 2)*WORDSIZE;</span>
<span class="plain">    RawBufferSize = TEXT_TY_BufferSize;</span>
<span class="plain">    buffer = RawBufferAddress + TEXT_TY_CastPrimitiveNesting*buffer_size;</span>
<span class="plain">    TEXT_TY_CastPrimitiveNesting++;</span>
<span class="plain">    if (TEXT_TY_CastPrimitiveNesting &gt; TEXT_TY_NoBuffers) {</span>
<span class="plain">        buffer = VM_AllocateMemory(buffer_size); memory_to_free = buffer;</span>
<span class="plain">        if (buffer == 0)</span>
<span class="plain">            FlexError("ran out with too many simultaneous text conversions");</span>
<span class="plain">    }</span>
<span class="plain">    if (unicode_gestalt_ok) {</span>
<span class="plain">        SuspendRTP();</span>
<span class="plain">        .RetryWithLargerBuffer;</span>
<span class="plain">        saved_stream = glk_stream_get_current();</span>
<span class="plain">        stream = glk_stream_open_memory_uni(buffer, RawBufferSize, filemode_Write, 0);</span>
<span class="plain">        glk_stream_set_current(stream);</span>
<span class="plain">        @push say__p; @push say__pc;</span>
<span class="plain">        ClearParagraphing(7);</span>
<span class="plain">        if (from_snippet) print (PrintSnippet) from_value;</span>
<span class="plain">        else print (PrintI6Text) from_value;</span>
<span class="plain">        @pull say__pc; @pull say__p;</span>
<span class="plain">        results = buffer + buffer_size - 2*WORDSIZE;</span>
<span class="plain">        glk_stream_close(stream, results);</span>
<span class="plain">        if (saved_stream) glk_stream_set_current(saved_stream);</span>
<span class="plain">        ResumeRTP();</span>
<span class="plain">        len = results--&gt;1;</span>
<span class="plain">        if (len &gt; RawBufferSize-1) {</span>
<span class="plain">            ! Glulx had to truncate text output because the buffer ran out:</span>
<span class="plain">            ! len is the number of characters which it tried to print</span>
<span class="plain">            news = RawBufferSize;</span>
<span class="plain">            while (news &lt; len) news=news*2;</span>
<span class="plain">            i = VM_AllocateMemory(news*WORDSIZE);</span>
<span class="plain">            if (i ~= 0) {</span>
<span class="plain">                if (memory_to_free) VM_FreeMemory(memory_to_free);</span>
<span class="plain">                memory_to_free = i;</span>
<span class="plain">                buffer = i;</span>
<span class="plain">                RawBufferSize = news;</span>
<span class="plain">                buffer_size = (RawBufferSize + 2)*WORDSIZE;</span>
<span class="plain">                jump RetryWithLargerBuffer;</span>
<span class="plain">            }</span>
<span class="plain">            ! Memory allocation refused: all we can do is to truncate the text</span>
<span class="plain">            len = RawBufferSize-1;</span>
<span class="plain">        }</span>
<span class="plain">        buffer--&gt;(len) = 0;</span>
<span class="plain">        TEXT_TY_CastPrimitiveNesting--;</span>
<span class="plain">        BlkValueMassCopyFromArray(to_txt, buffer, 4, len+1);</span>
<span class="plain">    } else {</span>
<span class="plain">        RunTimeProblem(RTP_NOGLULXUNICODE);</span>
<span class="plain">    }</span>
<span class="plain">    if (memory_to_free) VM_FreeMemory(memory_to_free);</span>
<span class="plain">];</span>
<span class="plain">#endif;</span>
</pre>
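<p class="inwebparagraph">The retry strategy of the Glulx version (print to memory, notice truncation, double the buffer, print again) can be modelled loosely in Python. This is a sketch under stated assumptions rather than a transcription: <code class="display"><span class="extract">write_truncated</span></code> is a hypothetical stand-in for a memory stream which reports how many characters were attempted, allocation is assumed never to fail, and the growth test here always leaves room for the terminator.
</p>

```python
def write_truncated(text, size):
    """Model of a memory stream: keep at most `size` characters, but
    report how many characters the caller tried to print."""
    return text[:size], len(text)


def capture(text, size=8):
    """Print `text` to memory, doubling the buffer until the text fits
    with room for a terminator. Returns (codes, final buffer size)."""
    while True:
        stored, attempted = write_truncated(text, size)
        if size - 1 >= attempted:
            # Everything fitted: append the null terminator
            return [ord(ch) for ch in stored] + [0], size
        # Output was truncated: grow the buffer and try again
        while attempted > size - 1:
            size *= 2
```

<p class="inwebparagraph">The real routine must also cope with a refused allocation, in which case it falls back to truncating the text; the sketch omits that branch.
</p>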
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP14"></a><b>&#167;14. Comparison. </b>This is more or less <code class="display"><span class="extract">strcmp</span></code>, the traditional C library routine for comparing
strings, but it does pose a few interesting questions. The answers are:
</p>
<p class="inwebparagraph"></p>
<ul class="items"><li>(a) Two different unexpanded texts with substitutions are never equal, so
"[X]" and "[Y]" aren't equal as texts even if X and Y are equal.
</li><li>(b) Otherwise we test the current value of the text as expanded, so "[X]"
and "17" can be equal as texts if X is 17.
</li></ul>
<pre class="display">
<span class="plain">[ TEXT_TY_Compare left_txt right_txt rv;</span>
<span class="plain">    @push say__comp;</span>
<span class="plain">    say__comp = true;</span>
<span class="plain">    rv = TEXT_TY_Compare_Inner(left_txt, right_txt);</span>
<span class="plain">    @pull say__comp;</span>
<span class="plain">    return rv;</span>
<span class="plain">];</span>
<span class="plain">[ TEXT_TY_Compare_Inner left_txt right_txt</span>
<span class="plain">    pos ch1 ch2 capacity_left capacity_right fl fr cl cr cpl cpr;</span>
<span class="plain">    if (left_txt--&gt;0 &amp; BLK_BVBITMAP_LONGBLOCKMASK == 0) fl = true;</span>
<span class="plain">    if (right_txt--&gt;0 &amp; BLK_BVBITMAP_LONGBLOCKMASK == 0) fr = true;</span>
<span class="plain">    if (fl &amp;&amp; fr) {</span>
<span class="plain">        if ((left_txt--&gt;1 ofclass String) &amp;&amp; (right_txt--&gt;1 ofclass String))</span>
<span class="plain">            return left_txt--&gt;1 - right_txt--&gt;1;</span>
<span class="plain">        if ((left_txt--&gt;1 ofclass Routine) &amp;&amp; (right_txt--&gt;1 ofclass Routine))</span>
<span class="plain">            return left_txt--&gt;1 - right_txt--&gt;1;</span>
<span class="plain">        cpl = left_txt--&gt;0; cl = TEXT_TY_Temporarily_Transmute(left_txt);</span>
<span class="plain">        cpr = right_txt--&gt;0; cr = TEXT_TY_Temporarily_Transmute(right_txt);</span>
<span class="plain">    } else if (fl) {</span>
<span class="plain">        cpl = left_txt--&gt;0; cl = TEXT_TY_Temporarily_Transmute(left_txt);</span>
<span class="plain">    } else if (fr) {</span>
<span class="plain">        cpr = right_txt--&gt;0; cr = TEXT_TY_Temporarily_Transmute(right_txt);</span>
<span class="plain">    }</span>
<span class="plain">    if ((cl) || (cr)) {</span>
<span class="plain">        pos = TEXT_TY_Compare(left_txt, right_txt);</span>
<span class="plain">        TEXT_TY_Untransmute(left_txt, cl, cpl);</span>
<span class="plain">        TEXT_TY_Untransmute(right_txt, cr, cpr);</span>
<span class="plain">        return pos;</span>
<span class="plain">    }</span>
<span class="plain">    capacity_left = BlkValueLBCapacity(left_txt);</span>
<span class="plain">    capacity_right = BlkValueLBCapacity(right_txt);</span>
<span class="plain">    for (pos=0:(pos&lt;capacity_left) &amp;&amp; (pos&lt;capacity_right):pos++) {</span>
<span class="plain">        ch1 = BlkValueRead(left_txt, pos);</span>
<span class="plain">        ch2 = BlkValueRead(right_txt, pos);</span>
<span class="plain">        if (ch1 ~= ch2) return ch1-ch2;</span>
<span class="plain">        if (ch1 == 0) return 0;</span>
<span class="plain">    }</span>
<span class="plain">    if (pos == capacity_left) return -1;</span>
<span class="plain">    return 1;</span>
<span class="plain">];</span>
<span class="plain">[ TEXT_TY_Distinguish left_txt right_txt;</span>
<span class="plain">    if (TEXT_TY_Compare(left_txt, right_txt) == 0) rfalse;</span>
<span class="plain">    rtrue;</span>
<span class="plain">];</span>
</pre>
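<p class="inwebparagraph">The character-by-character loop at the end of the inner routine can be modelled in Python like so. This sketch covers only the case where both texts are already unpacked; the lists of character codes and the name <code class="display"><span class="extract">compare_unpacked</span></code> are assumptions made for illustration.
</p>

```python
def compare_unpacked(left, right):
    """Compare two null-terminated lists of character codes.
    Returns 0 if equal, otherwise a nonzero value whose sign gives
    the ordering, in the manner of strcmp."""
    for ch1, ch2 in zip(left, right):
        if ch1 != ch2:
            return ch1 - ch2
        if ch1 == 0:
            return 0  # both terminators reached together
    # One list ran out before a terminator or a difference was seen
    if min(len(left), len(right)) == len(left):
        return -1
    return 1
```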
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP15"></a><b>&#167;15. Hashing. </b>This calculates a hash value for the string, using Bernstein's algorithm.
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_Hash txt rv len i p cp;</span>
<span class="plain">    cp = txt--&gt;0; p = TEXT_TY_Temporarily_Transmute(txt);</span>
<span class="plain">    rv = 0;</span>
<span class="plain">    len = BlkValueLBCapacity(txt);</span>
<span class="plain">    for (i=0: i&lt;len: i++)</span>
<span class="plain">        rv = rv * 33 + BlkValueRead(txt, i);</span>
<span class="plain">    TEXT_TY_Untransmute(txt, p, cp);</span>
<span class="plain">    return rv;</span>
<span class="plain">];</span>
</pre>
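<p class="inwebparagraph">For readers unfamiliar with it, the multiply-by-33 recurrence can be tried out in Python. This is a sketch with assumptions: the name <code class="display"><span class="extract">bernstein_hash</span></code> is hypothetical, and overflow is modelled as unsigned wraparound at a chosen word size, whereas the virtual machines work with signed words.
</p>

```python
def bernstein_hash(entries, word_bits=32):
    """Hash a list of character codes with rv = rv*33 + ch, wrapping
    at the given word size. Like the routine above, it runs over every
    entry it is given, terminator included."""
    modulus = 2 ** word_bits
    rv = 0
    for ch in entries:
        rv = (rv * 33 + ch) % modulus
    return rv
```

<p class="inwebparagraph">For the entries 65, 66, 67, 0 (the text "ABC" with its terminator) the 32-bit result is 2409990.
</p>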
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP16"></a><b>&#167;16. Printing. </b>Unicode is not the native character set on Glulx: it came along as a late
addition to Glulx's specification. The deal is that we have to explicitly
tell the Glk interface layer to perform certain operations in a Unicode way;
if we simply perform <code class="display"><span class="extract">print (char) ch;</span></code> then the character <code class="display"><span class="extract">ch</span></code> will be
printed in ZSCII rather than Unicode.
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_Say txt ch i dsize;</span>
<span class="plain">    if (txt==0) rfalse;</span>
<span class="plain">    if (txt--&gt;0 &amp; BLK_BVBITMAP_LONGBLOCKMASK == 0) return PrintI6Text(txt--&gt;1);</span>
<span class="plain">    dsize = BlkValueLBCapacity(txt);</span>
<span class="plain">    for (i=0: i&lt;dsize: i++) {</span>
<span class="plain">        ch = BlkValueRead(txt, i);</span>
<span class="plain">        if (ch == 0) break;</span>
<span class="plain">        #ifdef TARGET_ZCODE;</span>
<span class="plain">        print (char) ch;</span>
<span class="plain">        #ifnot; ! TARGET_ZCODE</span>
<span class="plain">        @streamunichar ch;</span>
<span class="plain">        #endif;</span>
<span class="plain">    }</span>
<span class="plain">    if (i == 0) rfalse;</span>
<span class="plain">    rtrue;</span>
<span class="plain">];</span>
</pre>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP17"></a><b>&#167;17. Capitalised printing. </b>It turns out to be useful to have a variation on this, which prints the text with its first character capitalised:
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_Say_Capitalised txt mod rc;</span>
<span class="plain">    mod = BlkValueCreate(TEXT_TY);</span>
<span class="plain">    TEXT_TY_SubstitutedForm(mod, txt);</span>
<span class="plain">    if (TEXT_TY_CharacterLength(mod) &gt; 0) {</span>
<span class="plain">        BlkValueWrite(mod, 0, CharToCase(BlkValueRead(mod, 0), 1));</span>
<span class="plain">        TEXT_TY_Say(mod);</span>
<span class="plain">        rc = true;</span>
<span class="plain">        say__p = 1;</span>
<span class="plain">    }</span>
<span class="plain">    BlkValueFree(mod);</span>
<span class="plain">    return rc;</span>
<span class="plain">];</span>
</pre>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP18"></a><b>&#167;18. Serialisation. </b>Here we print a serialised form of a text which can later be used
to reconstruct the original text. The printing is apparently to the screen,
but in fact always takes place when the output stream is a file.
</p>
<p class="inwebparagraph">The format chosen is a letter "S" for string, then a comma-separated list
of decimal character codes, ending with the null terminator, and followed by
a semicolon: thus <code class="display"><span class="extract">S65,66,67,0;</span></code> is the serialised form of the text "ABC".
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_WriteFile txt len pos ch p cp;</span>
<span class="plain">    cp = txt--&gt;0; p = TEXT_TY_Temporarily_Transmute(txt);</span>
<span class="plain">    len = BlkValueLBCapacity(txt);</span>
<span class="plain">    print "S";</span>
<span class="plain">    for (pos=0: pos&lt;=len: pos++) {</span>
<span class="plain">        if (pos == len) ch = 0; else ch = BlkValueRead(txt, pos);</span>
<span class="plain">        if (ch == 0) {</span>
<span class="plain">            print "0;"; break;</span>
<span class="plain">        } else {</span>
<span class="plain">            print ch, ",";</span>
<span class="plain">        }</span>
<span class="plain">    }</span>
<span class="plain">    TEXT_TY_Untransmute(txt, p, cp);</span>
<span class="plain">];</span>
</pre>
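<p class="inwebparagraph">The serialised format described above is easy to reproduce outside Inform. The following Python function is a sketch of the format only; the name <code class="display"><span class="extract">serialise_text</span></code> is hypothetical, and a Python string stands in for the text.
</p>

```python
def serialise_text(text):
    """Return the letter S, the decimal character codes separated by
    commas, the null terminator, and a closing semicolon."""
    codes = [str(ord(ch)) for ch in text]
    return "S" + ",".join(codes + ["0"]) + ";"
```

<p class="inwebparagraph">Given "ABC" this yields <code class="display"><span class="extract">S65,66,67,0;</span></code>, and the empty text yields <code class="display"><span class="extract">S0;</span></code>.
</p>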
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP19"></a><b>&#167;19. Unserialisation. </b>If that's the word: the reverse process, in which we read a stream of
characters from a file and reconstruct the text which gave rise to
them.
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_ReadFile txt auxf ch i v dg pos tsize p;</span>
<span class="plain">    TEXT_TY_Transmute(txt);</span>
<span class="plain">    tsize = BlkValueLBCapacity(txt);</span>
<span class="plain">    while (ch ~= 32 or 9 or 10 or 13 or 0 or -1) {</span>
<span class="plain">        ch = FileIO_GetC(auxf);</span>
<span class="plain">        if (ch == ',' or ';') {</span>
<span class="plain">            if (pos+1 &gt;= tsize) {</span>
<span class="plain">                if (BlkValueSetLBCapacity(txt, 2*pos) == false) break;</span>
<span class="plain">                tsize = BlkValueLBCapacity(txt);</span>
<span class="plain">            }</span>
<span class="plain">            BlkValueWrite(txt, pos++, v);</span>
<span class="plain">            v = 0;</span>
<span class="plain">            if (ch == ';') break;</span>
<span class="plain">        } else {</span>
<span class="plain">            dg = ch - '0';</span>
<span class="plain">            v = v*10 + dg;</span>
<span class="plain">        }</span>
<span class="plain">    }</span>
<span class="plain">    BlkValueWrite(txt, pos, 0);</span>
<span class="plain">    return txt;</span>
<span class="plain">];</span>
</pre>
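<p class="inwebparagraph">The digit-accumulating parse can likewise be modelled in Python. This is a sketch, not the template's routine: the name <code class="display"><span class="extract">unserialise_text</span></code> is hypothetical, the input is assumed to be a whole string rather than a file stream, and the leading "S" is skipped here, whereas the routine above starts reading after it.
</p>

```python
def unserialise_text(serialised):
    """Rebuild a string from the form S65,66,67,0; by accumulating
    decimal digits and emitting a code at each comma or semicolon."""
    body = serialised[1:] if serialised.startswith("S") else serialised
    codes, value = [], 0
    for ch in body:
        if ch in ",;":
            codes.append(value)
            value = 0
            if ch == ";":
                break
        elif ch in "0123456789":
            value = value * 10 + int(ch)
        else:
            break  # white space or anything unexpected ends the parse
    if 0 in codes:
        codes = codes[:codes.index(0)]  # stop at the null terminator
    return "".join(chr(code) for code in codes)
```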
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP20"></a><b>&#167;20. Substitution. </b></p>
<pre class="display">
<span class="plain">[ TEXT_TY_SubstitutedForm to txt;</span>
<span class="plain">    if (txt) {</span>
<span class="plain">        BlkValueCopy(to, txt);</span>
<span class="plain">        TEXT_TY_Transmute(to);</span>
<span class="plain">    }</span>
<span class="plain">    return to;</span>
<span class="plain">];</span>
<span class="plain">[ TEXT_TY_IsSubstituted txt;</span>
<span class="plain">    if ((txt) &amp;&amp;</span>
<span class="plain">        (txt--&gt;0 &amp; BLK_BVBITMAP_LONGBLOCKMASK == 0) &amp;&amp;</span>
<span class="plain">        (txt--&gt;1 ofclass Routine)) rfalse;</span>
<span class="plain">    rtrue;</span>
<span class="plain">];</span>
</pre>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP21"></a><b>&#167;21. Perishability. </b>As noted above, a perishable constant is one which must be expanded before
the values it refers to vanish from existence.
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_ExpandIfPerishable to from;</span>
<span class="plain">    if ((from) &amp;&amp; (from--&gt;0 == CONSTANT_PERISHABLE_TEXT_STORAGE))</span>
<span class="plain">        return TEXT_TY_SubstitutedForm(to, from);</span>
<span class="plain">    return from;</span>
<span class="plain">];</span>
</pre>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP22"></a><b>&#167;22. Blobs. </b>That completes the compulsory services required for this KOV to function:
from here on, the remaining routines provide definitions of text-related
phrases in the Standard Rules.
</p>
<p class="inwebparagraph">What are the basic operations of text-handling? Clearly we want to be able
to search, and replace, but that is left for the segment "RegExp.i6t"
to handle. More basically we would like to be able to read and write
characters from the text. But texts in I7 tend to be of natural language,
rather than containing arbitrary material &mdash; that's indeed why we call them
texts rather than strings. This means they are likely to be punctuated
sequences of words, divided up perhaps into sentences and even paragraphs.
</p>
<p class="inwebparagraph">So we provide facilities which regard a text as being an array of "blobs",
where a "blob" is a unit of text. The user can choose whether to see it
as an array of characters, or words (of three different sorts: see the
Inform documentation for details), or paragraphs, or lines.
</p>
<pre class="display">
<span class="plain">Constant CHR_BLOB = 1; ! Construe as an array of characters</span>
<span class="plain">Constant WORD_BLOB = 2; ! Of words</span>
<span class="plain">Constant PWORD_BLOB = 3; ! Of punctuated words</span>
<span class="plain">Constant UWORD_BLOB = 4; ! Of unpunctuated words</span>
<span class="plain">Constant PARA_BLOB = 5; ! Of paragraphs</span>
<span class="plain">Constant LINE_BLOB = 6; ! Of lines</span>
<span class="plain">Constant REGEXP_BLOB = 7; ! Not a blob type as such, but needed as a distinct value</span>
</pre>
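<p class="inwebparagraph">To make the idea of blobs concrete, here is a minimal sketch in Python (used purely
for illustration; it is not part of the template) of how one text divides into
unpunctuated words, lines or paragraphs. The name <code class="display"><span class="extract">split_blobs</span></code> and its parameter
<code class="display"><span class="extract">newlines_needed</span></code> are hypothetical. The separator rule is a reading of the routine in
the next section: a run of whitespace ends a blob only if it holds enough
line-break characters, and leading or trailing whitespace never belongs to a blob.
</p>
<pre class="display">
```python
WHITESPACE = " \t\n\r"   # the characters 32, 9, 10 and 13
LINE_BREAKS = "\n\r"     # the characters 10 and 13


def split_blobs(text, newlines_needed):
    """Split text at whitespace runs holding enough line-break characters.

    newlines_needed is 0 for unpunctuated words, 1 for lines and
    2 for paragraphs.
    """
    blobs = []
    current = ""
    i = 0
    while i != len(text):
        if text[i] not in WHITESPACE:
            current += text[i]
            i += 1
            continue
        # Measure the whole run of whitespace starting here.
        j = i
        while j != len(text) and text[j] in WHITESPACE:
            j += 1
        run = text[i:j]
        breaks = sum(1 for ch in run if ch in LINE_BREAKS)
        if current:
            if j == len(text) or breaks >= newlines_needed:
                blobs.append(current)      # the run ends the blob
                current = ""
            else:
                current += run             # the run stays inside the blob
        i = j
    if current:
        blobs.append(current)
    return blobs
```
</pre>
<p class="inwebparagraph">Applied to the text "One two.", a line break, "Three  four", a blank line and
"Five", this sketch finds five unpunctuated words, three lines and two paragraphs.
</p>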
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP23"></a><b>&#167;23. Blob Access. </b>The following routine runs a small finite-state-machine to count the number
of blobs in a text, using any of the above blob types (except
<code class="display"><span class="extract">REGEXP_BLOB</span></code>, which is used for other purposes). If the optional arguments
<code class="display"><span class="extract">ctxt</span></code> and <code class="display"><span class="extract">wanted</span></code> are supplied, it also copies the text of blob number
<code class="display"><span class="extract">wanted</span></code> (counting upwards from 1 at the start of the text) into the
text <code class="display"><span class="extract">ctxt</span></code>. If the further optional argument <code class="display"><span class="extract">rtxt</span></code> is supplied,
then <code class="display"><span class="extract">ctxt</span></code> is instead written with the original text <code class="display"><span class="extract">txt</span></code> as it would
read if the blob in question were replaced with the text in <code class="display"><span class="extract">rtxt</span></code>.
</p>
<pre class="display">
<span class="plain">Constant WS_BRM = 1;</span>
<span class="plain">Constant SKIPPED_BRM = 2;</span>
<span class="plain">Constant ACCEPTED_BRM = 3;</span>
<span class="plain">Constant ACCEPTEDP_BRM = 4;</span>
<span class="plain">Constant ACCEPTEDN_BRM = 5;</span>
<span class="plain">Constant ACCEPTEDPN_BRM = 6;</span>
<span class="plain">[ TEXT_TY_BlobAccess txt blobtype ctxt wanted rtxt</span>
<span class="plain">p1 p2 cp1 cp2 r;</span>
<span class="plain">if (txt==0) return 0;</span>
<span class="plain">if (blobtype == CHR_BLOB) return TEXT_TY_CharacterLength(txt);</span>
<span class="plain">cp1 = txt--&gt;0; p1 = TEXT_TY_Temporarily_Transmute(txt);</span>
<span class="plain">cp2 = rtxt--&gt;0; p2 = TEXT_TY_Temporarily_Transmute(rtxt);</span>
<span class="plain">TEXT_TY_Transmute(ctxt);</span>
<span class="plain">r = TEXT_TY_BlobAccessI(txt, blobtype, ctxt, wanted, rtxt);</span>
<span class="plain">TEXT_TY_Untransmute(txt, p1, cp1);</span>
<span class="plain">TEXT_TY_Untransmute(rtxt, p2, cp2);</span>
<span class="plain">return r;</span>
<span class="plain">];</span>
<span class="plain">[ TEXT_TY_BlobAccessI txt blobtype ctxt wanted rtxt</span>
<span class="plain">brm oldbrm ch i dsize csize blobcount gp cl j;</span>
<span class="plain">dsize = BlkValueLBCapacity(txt);</span>
<span class="plain">if (ctxt) csize = BlkValueLBCapacity(ctxt);</span>
<span class="plain">else if (rtxt) "*** rtxt without ctxt ***";</span>
<span class="plain">brm = WS_BRM;</span>
<span class="plain">for (i=0:i&lt;dsize:i++) {</span>
<span class="plain">ch = BlkValueRead(txt, i);</span>
<span class="plain">if (ch == 0) break;</span>
<span class="plain">oldbrm = brm;</span>
<span class="plain">if (ch == 10 or 13 or 32 or 9) {</span>
<span class="plain">if (oldbrm ~= WS_BRM) {</span>
<span class="plain">gp = 0;</span>
<span class="plain">for (j=i:j&lt;dsize:j++) {</span>
<span class="plain">ch = BlkValueRead(txt, j);</span>
<span class="plain">if (ch == 0) { brm = WS_BRM; break; }</span>
<span class="plain">if (ch == 10 or 13) { gp++; continue; }</span>
<span class="plain">if (ch ~= 32 or 9) break;</span>
<span class="plain">}</span>
<span class="plain">ch = BlkValueRead(txt, i);</span>
<span class="plain">if (j == dsize) brm = WS_BRM;</span>
<span class="plain">switch (blobtype) {</span>
<span class="plain">PARA_BLOB: if (gp &gt;= 2) brm = WS_BRM;</span>
<span class="plain">LINE_BLOB: if (gp &gt;= 1) brm = WS_BRM;</span>
<span class="plain">default: brm = WS_BRM;</span>
<span class="plain">}</span>
<span class="plain">}</span>
<span class="plain">} else {</span>
<span class="plain">gp = false;</span>
<span class="plain">if ((blobtype == WORD_BLOB or PWORD_BLOB or UWORD_BLOB) &amp;&amp;</span>
<span class="plain">(ch == '.' or ',' or '!' or '?'</span>
<span class="plain">or '-' or '/' or '"' or ':' or ';'</span>
<span class="plain">or '(' or ')' or '[' or ']' or '{' or '}'))</span>
<span class="plain">gp = true;</span>
<span class="plain">switch (oldbrm) {</span>
<span class="plain">WS_BRM:</span>
<span class="plain">brm = ACCEPTED_BRM;</span>
<span class="plain">if (blobtype == WORD_BLOB) {</span>
<span class="plain">if (gp) brm = SKIPPED_BRM;</span>
<span class="plain">}</span>
<span class="plain">if (blobtype == PWORD_BLOB) {</span>
<span class="plain">if (gp) brm = ACCEPTEDP_BRM;</span>
<span class="plain">}</span>
<span class="plain">SKIPPED_BRM:</span>
<span class="plain">if (blobtype == WORD_BLOB) {</span>
<span class="plain">if (gp == false) brm = ACCEPTED_BRM;</span>
<span class="plain">}</span>
<span class="plain">ACCEPTED_BRM:</span>
<span class="plain">if (blobtype == WORD_BLOB) {</span>
<span class="plain">if (gp) brm = SKIPPED_BRM;</span>
<span class="plain">}</span>
<span class="plain">if (blobtype == PWORD_BLOB) {</span>
<span class="plain">if (gp) brm = ACCEPTEDP_BRM;</span>
<span class="plain">}</span>
<span class="plain">ACCEPTEDP_BRM:</span>
<span class="plain">if (blobtype == PWORD_BLOB) {</span>
<span class="plain">if (gp == false) brm = ACCEPTED_BRM;</span>
<span class="plain">else {</span>
<span class="plain">if ((ch == BlkValueRead(txt, i-1)) &amp;&amp;</span>
<span class="plain">(ch == '-' or '.')) blobcount--;</span>
<span class="plain">blobcount++;</span>
<span class="plain">}</span>
<span class="plain">}</span>
<span class="plain">ACCEPTEDN_BRM:</span>
<span class="plain">if (blobtype == WORD_BLOB) {</span>
<span class="plain">if (gp) brm = SKIPPED_BRM;</span>
<span class="plain">}</span>
<span class="plain">if (blobtype == PWORD_BLOB) {</span>
<span class="plain">if (gp) brm = ACCEPTEDP_BRM;</span>
<span class="plain">}</span>
<span class="plain">ACCEPTEDPN_BRM:</span>
<span class="plain">if (blobtype == PWORD_BLOB) {</span>
<span class="plain">if (gp == false) brm = ACCEPTED_BRM;</span>
<span class="plain">else {</span>
<span class="plain">if ((ch == BlkValueRead(txt, i-1)) &amp;&amp;</span>
<span class="plain">(ch == '-' or '.')) blobcount--;</span>
<span class="plain">blobcount++;</span>
<span class="plain">}</span>
<span class="plain">}</span>
<span class="plain">}</span>
<span class="plain">}</span>
<span class="plain">if (brm == ACCEPTED_BRM or ACCEPTEDP_BRM) {</span>
<span class="plain">if (oldbrm ~= brm) blobcount++;</span>
<span class="plain">if ((ctxt) &amp;&amp; (blobcount == wanted)) {</span>
<span class="plain">if (rtxt) {</span>
<span class="plain">BlkValueWrite(ctxt, cl, 0);</span>
<span class="plain">TEXT_TY_Concatenate(ctxt, rtxt, CHR_BLOB);</span>
<span class="plain">csize = BlkValueLBCapacity(ctxt);</span>
<span class="plain">cl = TEXT_TY_CharacterLength(ctxt);</span>
<span class="plain">if (brm == ACCEPTED_BRM) brm = ACCEPTEDN_BRM;</span>
<span class="plain">if (brm == ACCEPTEDP_BRM) brm = ACCEPTEDPN_BRM;</span>
<span class="plain">} else {</span>
<span class="plain">if (cl+1 &gt;= csize) {</span>
<span class="plain">if (BlkValueSetLBCapacity(ctxt, 2*cl) == false) break;</span>
<span class="plain">csize = BlkValueLBCapacity(ctxt);</span>
<span class="plain">}</span>
<span class="plain">BlkValueWrite(ctxt, cl++, ch);</span>
<span class="plain">}</span>
<span class="plain">} else {</span>
<span class="plain">if (rtxt) {</span>
<span class="plain">if (cl+1 &gt;= csize) {</span>
<span class="plain">if (BlkValueSetLBCapacity(ctxt, 2*cl) == false) break;</span>
<span class="plain">csize = BlkValueLBCapacity(ctxt);</span>
<span class="plain">}</span>
<span class="plain">BlkValueWrite(ctxt, cl++, ch);</span>
<span class="plain">}</span>
<span class="plain">}</span>
<span class="plain">} else {</span>
<span class="plain">if ((rtxt) &amp;&amp; (brm ~= ACCEPTEDN_BRM or ACCEPTEDPN_BRM)) {</span>
<span class="plain">if (cl+1 &gt;= csize) {</span>
<span class="plain">if (BlkValueSetLBCapacity(ctxt, 2*cl) == false) break;</span>
<span class="plain">csize = BlkValueLBCapacity(ctxt);</span>
<span class="plain">}</span>
<span class="plain">BlkValueWrite(ctxt, cl++, ch);</span>
<span class="plain">}</span>
<span class="plain">}</span>
<span class="plain">}</span>
<span class="plain">if (ctxt) BlkValueWrite(ctxt, cl++, 0);</span>
<span class="plain">return blobcount;</span>
<span class="plain">];</span>
</pre>
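<p class="inwebparagraph">The word-handling part of the state machine can be sketched in Python as follows.
This is a simplified illustration rather than a port: it ignores replacement (the
<code class="display"><span class="extract">ACCEPTEDN_BRM</span></code> and <code class="display"><span class="extract">ACCEPTEDPN_BRM</span></code> states), lines and paragraphs, and it returns the
blobs as a list instead of counting them. The function name <code class="display"><span class="extract">blobs_of_words</span></code> and the
string state names are hypothetical.
</p>
<pre class="display">
```python
PUNCTUATION = set('.,!?-/":;()[]{}')
WHITESPACE = set(" \t\n\r")


def blobs_of_words(text, kind):
    """Return the word blobs of text; kind is "word", "pword" or "uword"."""
    blobs = []
    state = "ws"          # "ws", "skipped", "accepted" or "acceptedp"
    previous = ""
    for ch in text:
        old_state = state
        if ch in WHITESPACE:
            state = "ws"
        elif ch in PUNCTUATION and kind == "word":
            state = "skipped"       # punctuation is dropped entirely
        elif ch in PUNCTUATION and kind == "pword":
            state = "acceptedp"     # punctuation forms blobs of its own
        else:
            state = "accepted"
        if state in ("accepted", "acceptedp"):
            starts_blob = old_state != state
            if old_state == "acceptedp" and state == "acceptedp":
                # Each mark is a new blob, except a repeated "-" or ".".
                starts_blob = not (ch == previous and ch in "-.")
            if starts_blob:
                blobs.append(ch)
            else:
                blobs[-1] += ch
        previous = ch
    return blobs
```
</pre>
<p class="inwebparagraph">For the sample text 'Wait... what? A well-known "fact".' the sketch yields 5
unpunctuated words, 6 words and 12 punctuated words: the ellipsis counts as a
single punctuated word because a repeated full stop does not start a new blob.
</p>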
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP24"></a><b>&#167;24. Get Blob. </b>This is the front end, which uses the above routine to read a blob. (Note that, for
efficiency's sake, single characters are read more directly.)
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_GetBlob ctxt txt wanted blobtype;</span>
<span class="plain">if (txt==0) return;</span>
<span class="plain">if (blobtype == CHR_BLOB) return TEXT_TY_GetCharacter(ctxt, txt, wanted);</span>
<span class="plain">TEXT_TY_BlobAccess(txt, blobtype, ctxt, wanted);</span>
<span class="plain">return ctxt;</span>
<span class="plain">];</span>
</pre>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP25"></a><b>&#167;25. Replace Blob. </b>This is the front end, which uses the above routine to replace a blob. (Once again,
characters are handled directly to avoid incurring all that overhead.)
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_ReplaceBlob blobtype txt wanted rtxt ctxt ilen rlen i p cp;</span>
<span class="plain">TEXT_TY_Transmute(txt);</span>
<span class="plain">cp = rtxt--&gt;0; p = TEXT_TY_Temporarily_Transmute(rtxt);</span>
<span class="plain">if (blobtype == CHR_BLOB) {</span>
<span class="plain">ilen = TEXT_TY_CharacterLength(txt);</span>
<span class="plain">rlen = TEXT_TY_CharacterLength(rtxt);</span>
<span class="plain">wanted--;</span>
<span class="plain">if ((wanted &gt;= 0) &amp;&amp; (wanted&lt;ilen)) {</span>
<span class="plain">if (rlen == 1) {</span>
<span class="plain">BlkValueWrite(txt, wanted, BlkValueRead(rtxt, 0));</span>
<span class="plain">} else {</span>
<span class="plain">ctxt = BlkValueCreate(TEXT_TY);</span>
<span class="plain">TEXT_TY_Transmute(ctxt);</span>
<span class="plain">if (BlkValueSetLBCapacity(ctxt, ilen+rlen+1)) {</span>
<span class="plain">for (i=0:i&lt;wanted:i++)</span>
<span class="plain">BlkValueWrite(ctxt, i, BlkValueRead(txt, i));</span>
<span class="plain">for (i=0:i&lt;rlen:i++)</span>
<span class="plain">BlkValueWrite(ctxt, wanted+i, BlkValueRead(rtxt, i));</span>
<span class="plain">for (i=wanted+1:i&lt;ilen:i++)</span>
<span class="plain">BlkValueWrite(ctxt, rlen+i-1, BlkValueRead(txt, i));</span>
<span class="plain">BlkValueWrite(ctxt, rlen+ilen, 0);</span>
<span class="plain">BlkValueCopy(txt, ctxt);</span>
<span class="plain">}</span>
<span class="plain">BlkValueFree(ctxt);</span>
<span class="plain">}</span>
<span class="plain">}</span>
<span class="plain">} else {</span>
<span class="plain">ctxt = BlkValueCreate(TEXT_TY);</span>
<span class="plain">TEXT_TY_BlobAccess(txt, blobtype, ctxt, wanted, rtxt);</span>
<span class="plain">BlkValueCopy(txt, ctxt);</span>
<span class="plain">BlkValueFree(ctxt);</span>
<span class="plain">}</span>
<span class="plain">TEXT_TY_Untransmute(rtxt, p, cp);</span>
<span class="plain">];</span>
</pre>
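<p class="inwebparagraph">The character case above amounts to a splice: everything before the wanted
character, then the replacement, then everything after it. A minimal Python sketch,
assuming ordinary Python strings in place of block values (the name
<code class="display"><span class="extract">replace_character</span></code> is hypothetical):
</p>
<pre class="display">
```python
def replace_character(text, wanted, replacement):
    """Replace the character numbered wanted (counting from 1)."""
    index = wanted - 1
    if index not in range(len(text)):
        return text                    # out of range: leave the text alone
    # Head, then replacement, then tail: the result has
    # len(text) + len(replacement) - 1 characters.
    return text[:index] + replacement + text[index + 1:]
```
</pre>
<p class="inwebparagraph">So replacing character 2 of "cat" with "oas" gives "coast", while an out-of-range
position leaves the text unchanged, as in the routine above.
</p>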
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP26"></a><b>&#167;26. Replace Text. </b>This is the general routine which searches for any instance of <code class="display"><span class="extract">ftxt</span></code>,
as a blob, in <code class="display"><span class="extract">txt</span></code>, and replaces it with the text <code class="display"><span class="extract">rtxt</span></code>. It works on
any of the above blob-types, but two cases are special: first, if the
blob-type is <code class="display"><span class="extract">CHR_BLOB</span></code>, then it can do more than search and replace
for any instance of a single character: it can search and replace any
instance of a substring, so that <code class="display"><span class="extract">ftxt</span></code> is not required to be only a
single character. Second, if the blob-type is the special value
<code class="display"><span class="extract">REGEXP_BLOB</span></code> then <code class="display"><span class="extract">ftxt</span></code> is interpreted as a regular expression rather
than something literal to find: see "RegExp.i6t" for what happens next.
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_ReplaceText blobtype txt ftxt rtxt</span>
<span class="plain">r p1 p2 cp1 cp2;</span>
<span class="plain">TEXT_TY_Transmute(txt);</span>
<span class="plain">cp1 = ftxt--&gt;0; p1 = TEXT_TY_Temporarily_Transmute(ftxt);</span>
<span class="plain">cp2 = rtxt--&gt;0; p2 = TEXT_TY_Temporarily_Transmute(rtxt);</span>
<span class="plain">r = TEXT_TY_ReplaceTextI(blobtype, txt, ftxt, rtxt);</span>
<span class="plain">TEXT_TY_Untransmute(ftxt, p1, cp1);</span>
<span class="plain">TEXT_TY_Untransmute(rtxt, p2, cp2);</span>
<span class="plain">return r;</span>
<span class="plain">];</span>
<span class="plain">[ TEXT_TY_ReplaceTextI blobtype txt ftxt rtxt</span>
<span class="plain">ctxt csize ilen flen i cl mpos ch chm whitespace punctuation;</span>
<span class="plain">if (blobtype == REGEXP_BLOB or CHR_BLOB)</span>
<span class="plain">return TEXT_TY_Replace_RE(blobtype, txt, ftxt, rtxt);</span>
<span class="plain">ilen = TEXT_TY_CharacterLength(txt);</span>
<span class="plain">flen = TEXT_TY_CharacterLength(ftxt);</span>
<span class="plain">ctxt = BlkValueCreate(TEXT_TY);</span>
<span class="plain">TEXT_TY_Transmute(ctxt);</span>
<span class="plain">csize = BlkValueLBCapacity(ctxt);</span>
<span class="plain">mpos = 0;</span>
<span class="plain">whitespace = true; punctuation = false;</span>
<span class="plain">for (i=0:i&lt;=ilen:i++) {</span>
<span class="plain">ch = BlkValueRead(txt, i);</span>
<span class="plain">.MoreMatching;</span>
<span class="plain">chm = BlkValueRead(ftxt, mpos++);</span>
<span class="plain">if (mpos == 1) {</span>
<span class="plain">switch (blobtype) {</span>
<span class="plain">WORD_BLOB:</span>
<span class="plain">if ((whitespace == false) &amp;&amp; (punctuation == false)) chm = -1;</span>
<span class="plain">}</span>
<span class="plain">}</span>
<span class="plain">whitespace = false;</span>
<span class="plain">if (ch == 10 or 13 or 32 or 9) whitespace = true;</span>
<span class="plain">punctuation = false;</span>
<span class="plain">if (ch == '.' or ',' or '!' or '?'</span>
<span class="plain">or '-' or '/' or '"' or ':' or ';'</span>
<span class="plain">or '(' or ')' or '[' or ']' or '{' or '}') {</span>
<span class="plain">if (blobtype == WORD_BLOB) chm = -1;</span>
<span class="plain">punctuation = true;</span>
<span class="plain">}</span>
<span class="plain">if (ch == chm) {</span>
<span class="plain">if (mpos == flen) {</span>
<span class="plain">if (i == ilen) chm = 0;</span>
<span class="plain">else chm = BlkValueRead(txt, i+1);</span>
<span class="plain">if ((blobtype == CHR_BLOB) ||</span>
<span class="plain">(chm == 0 or 10 or 13 or 32 or 9) ||</span>
<span class="plain">(chm == '.' or ',' or '!' or '?'</span>
<span class="plain">or '-' or '/' or '"' or ':' or ';'</span>
<span class="plain">or '(' or ')' or '[' or ']' or '{' or '}')) {</span>
<span class="plain">mpos = 0;</span>
<span class="plain">cl = cl - (flen-1);</span>
<span class="plain">BlkValueWrite(ctxt, cl, 0);</span>
<span class="plain">TEXT_TY_Concatenate(ctxt, rtxt, CHR_BLOB);</span>
<span class="plain">csize = BlkValueLBCapacity(ctxt);</span>
<span class="plain">cl = TEXT_TY_CharacterLength(ctxt);</span>
<span class="plain">continue;</span>
<span class="plain">}</span>
<span class="plain">}</span>
<span class="plain">} else {</span>
<span class="plain">mpos = 0;</span>
<span class="plain">}</span>
<span class="plain">if (cl+1 &gt;= csize) {</span>
<span class="plain">if (BlkValueSetLBCapacity(ctxt, 2*cl) == false) break;</span>
<span class="plain">csize = BlkValueLBCapacity(ctxt);</span>
<span class="plain">}</span>
<span class="plain">BlkValueWrite(ctxt, cl++, ch);</span>
<span class="plain">}</span>
<span class="plain">BlkValueCopy(txt, ctxt);</span>
<span class="plain">BlkValueFree(ctxt);</span>
<span class="plain">];</span>
</pre>
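<p class="inwebparagraph">The intended effect for <code class="display"><span class="extract">WORD_BLOB</span></code> can be sketched in Python as below. This is a
simplified illustration of whole-word replacement, not a port of the matching loop:
it assumes that the text to find contains no punctuation, and the names
<code class="display"><span class="extract">replace_words</span></code> and <code class="display"><span class="extract">BOUNDARY</span></code> are hypothetical. A match counts only if it begins at
the start of the text or after whitespace or punctuation, and is followed by the
end of the text, whitespace or punctuation.
</p>
<pre class="display">
```python
BOUNDARY = set(' \t\n\r.,!?-/":;()[]{}')


def replace_words(text, find, replacement):
    """Replace each whole-word occurrence of find in text."""
    out = []
    n = len(find)
    i = 0
    while i != len(text):
        end = i + n
        starts_ok = i == 0 or text[i - 1] in BOUNDARY
        ends_ok = end == len(text) or (len(text) > end and text[end] in BOUNDARY)
        if n > 0 and starts_ok and ends_ok and text[i:end] == find:
            out.append(replacement)    # whole word matched: substitute it
            i = end
        else:
            out.append(text[i])        # otherwise copy one character across
            i += 1
    return "".join(out)
```
</pre>
<p class="inwebparagraph">With this sketch, replacing "cat" by "dog" in "the cat sat on the cathedral, cat."
changes the first and last words but leaves "cathedral" alone.
</p>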
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP27"></a><b>&#167;27. Character Length. </b>When accessing at the character-by-character level, things are much easier
and we needn't go through any finite state machine palaver.
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_CharacterLength txt ch i dsize p cp r;</span>
<span class="plain">if (txt==0) return 0;</span>
<span class="plain">cp = txt--&gt;0; p = TEXT_TY_Temporarily_Transmute(txt);</span>
<span class="plain">dsize = BlkValueLBCapacity(txt); r = dsize;</span>
<span class="plain">for (i=0:i&lt;dsize:i++) {</span>
<span class="plain">ch = BlkValueRead(txt, i);</span>
<span class="plain">if (ch == 0) { r = i; break; }</span>
<span class="plain">}</span>
<span class="plain">TEXT_TY_Untransmute(txt, p, cp);</span>
<span class="plain">return r;</span>
<span class="plain">];</span>
<span class="plain">[ TEXT_TY_Empty txt;</span>
<span class="plain">if (txt==0) rtrue;</span>
<span class="plain">if (txt--&gt;0 &amp; BLK_BVBITMAP_LONGBLOCKMASK == 0) {</span>
<span class="plain">if (txt--&gt;1 == EMPTY_TEXT_PACKED) rtrue;</span>
<span class="plain">rfalse;</span>
<span class="plain">}</span>
<span class="plain">if (TEXT_TY_CharacterLength(txt) == 0) rtrue;</span>
<span class="plain">rfalse;</span>
<span class="plain">];</span>
</pre>
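<p class="inwebparagraph">In other words, the length is the position of the first zero terminator, or the
full capacity if no terminator is found. A minimal Python sketch, assuming a list
of character codes stands in for the long block (the name <code class="display"><span class="extract">character_length</span></code> is
hypothetical):
</p>
<pre class="display">
```python
def character_length(buffer):
    """Count the characters before the first zero terminator."""
    for i, ch in enumerate(buffer):
        if ch == 0:
            return i
    return len(buffer)   # no terminator: the whole capacity is in use
```
</pre>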
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP28"></a><b>&#167;28. Get Character. </b>Characters in a text are numbered upwards from 1 by the users of this
routine: which is why we subtract 1 when reading the array in the
block-value, which counts from 0.
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_GetCharacter ctxt txt i ch p cp;</span>
<span class="plain">if (txt==0) return 0;</span>
<span class="plain">cp = txt--&gt;0; p = TEXT_TY_Temporarily_Transmute(txt);</span>
<span class="plain">TEXT_TY_Transmute(ctxt);</span>
<span class="plain">if ((i&lt;=0) || (i&gt;TEXT_TY_CharacterLength(txt))) ch = 0;</span>
<span class="plain">else ch = BlkValueRead(txt, i-1);</span>
<span class="plain">BlkValueWrite(ctxt, 0, ch);</span>
<span class="plain">BlkValueWrite(ctxt, 1, 0);</span>
<span class="plain">TEXT_TY_Untransmute(txt, p, cp);</span>
<span class="plain">return ctxt;</span>
<span class="plain">];</span>
</pre>
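<p class="inwebparagraph">The indexing convention can be sketched in Python as follows, assuming an
ordinary string in place of the block value (the name <code class="display"><span class="extract">get_character</span></code> is
hypothetical). An out-of-range position gives the empty text, just as the routine
above writes a lone terminator.
</p>
<pre class="display">
```python
def get_character(text, i):
    """Return character number i of text, counting from 1."""
    if i >= 1 and len(text) >= i:
        return text[i - 1]   # users count from 1, the array from 0
    return ""                # out of range: the empty text
```
</pre>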
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP29"></a><b>&#167;29. Casing. </b>In many programming languages, characters are a distinct data type from
strings, but not in I7. To I7, a character is simply a text which
happens to have length 1 &mdash; this has its inefficiencies, but is conceptually
easy for the user.
</p>
<p class="inwebparagraph"><code class="display"><span class="extract">TEXT_TY_CharactersOfCase(txt, case)</span></code> determines whether all the characters in <code class="display"><span class="extract">txt</span></code>
are letters of the given casing: 0 for lower case, 1 for upper case. Under
ZSCII, this correctly handles all of the European accented letters; under
Unicode, it follows the Unicode standard.
</p>
<p class="inwebparagraph">Note that there is no requirement for <code class="display"><span class="extract">txt</span></code> to be only a single character
long.
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_CharactersOfCase txt case i ch len p cp r;</span>
<span class="plain">if (txt==0) return 0;</span>
<span class="plain">cp = txt--&gt;0; p = TEXT_TY_Temporarily_Transmute(txt);</span>
<span class="plain">len = TEXT_TY_CharacterLength(txt);</span>
<span class="plain">r = true;</span>
<span class="plain">for (i=0:i&lt;len:i++) {</span>
<span class="plain">ch = BlkValueRead(txt, i);</span>
<span class="plain">if ((ch) &amp;&amp; (CharIsOfCase(ch, case) == false)) { r = false; break; }</span>
<span class="plain">}</span>
<span class="plain">TEXT_TY_Untransmute(txt, p, cp);</span>
<span class="plain">return r;</span>
<span class="plain">];</span>
</pre>
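<p class="inwebparagraph">A minimal Python sketch of the same test follows. It assumes that Python's own
Unicode tables stand in for <code class="display"><span class="extract">CharIsOfCase</span></code>, and that a character which is not a
letter fails the test for either case; the name <code class="display"><span class="extract">characters_of_case</span></code> is hypothetical.
</p>
<pre class="display">
```python
def characters_of_case(text, upper):
    """True if every character of text is a letter of the given case."""
    for ch in text:
        matches = ch.isupper() if upper else ch.islower()
        if not matches:
            return False   # one failure settles the question
    return True            # holds vacuously for the empty text
```
</pre>
<p class="inwebparagraph">As in the routine above, the empty text passes, since the loop never finds a
counterexample.
</p>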
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP30"></a><b>&#167;30. Change Case. </b>We set <code class="display"><span class="extract">ctxt</span></code> to the text in <code class="display"><span class="extract">txt</span></code>, except that all the letters are
converted to the <code class="display"><span class="extract">case</span></code> given (0 for lower, 1 for upper). Two further values are
also handled: 2 capitalises the first letter after each word boundary and lowers
the rest (title case), and 3 capitalises only the first letter of each sentence
(sentence case). What counts as a "letter", what case it has and what its
other-case form is are as specified in the ZSCII and Unicode standards.
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_CharactersToCase ctxt txt case i ch len bnd pk cp;</span>
<span class="plain">if (txt==0) return 0;</span>
<span class="plain">cp = txt--&gt;0; pk = TEXT_TY_Temporarily_Transmute(txt);</span>
<span class="plain">TEXT_TY_Transmute(ctxt);</span>
<span class="plain">len = TEXT_TY_CharacterLength(txt);</span>
<span class="plain">if (BlkValueSetLBCapacity(ctxt, len+1)) {</span>
<span class="plain">bnd = 1;</span>
<span class="plain">for (i=0:i&lt;len:i++) {</span>
<span class="plain">ch = BlkValueRead(txt, i);</span>
<span class="plain">if (case &lt; 2) {</span>
<span class="plain">BlkValueWrite(ctxt, i, CharToCase(ch, case));</span>
<span class="plain">} else {</span>
<span class="plain">BlkValueWrite(ctxt, i, CharToCase(ch, bnd));</span>
<span class="plain">if (case == 2) {</span>
<span class="plain">bnd = 0;</span>
<span class="plain">if (ch == 0 or 10 or 13 or 32 or 9</span>
<span class="plain">or '.' or ',' or '!' or '?'</span>
<span class="plain">or '-' or '/' or '"' or ':' or ';'</span>
<span class="plain">or '(' or ')' or '[' or ']' or '{' or '}') bnd = 1;</span>
<span class="plain">}</span>
<span class="plain">if (case == 3) {</span>
<span class="plain">if (ch ~= 0 or 10 or 13 or 32 or 9) {</span>
<span class="plain">if (bnd == 1) bnd = 0;</span>
<span class="plain">else {</span>
<span class="plain">if (ch == '.' or '!' or '?') bnd = 1;</span>
<span class="plain">}</span>
<span class="plain">}</span>
<span class="plain">}</span>
<span class="plain">}</span>
<span class="plain">}</span>
<span class="plain">BlkValueWrite(ctxt, len, 0);</span>
<span class="plain">}</span>
<span class="plain">TEXT_TY_Untransmute(txt, pk, cp);</span>
<span class="plain">return ctxt;</span>
<span class="plain">];</span>
</pre>
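<p class="inwebparagraph">The way the <code class="display"><span class="extract">bnd</span></code> flag drives title and sentence casing can be sketched in Python
as follows. This is an illustration under assumptions: Python's <code class="display"><span class="extract">upper()</span></code> and
<code class="display"><span class="extract">lower()</span></code> stand in for <code class="display"><span class="extract">CharToCase</span></code> (and may differ from it for unusual letters), and
the names <code class="display"><span class="extract">characters_to_case</span></code> and <code class="display"><span class="extract">upper_next</span></code> are hypothetical.
</p>
<pre class="display">
```python
BOUNDARY = set(' \t\n\r.,!?-/":;()[]{}')
SPACE = set(" \t\n\r")


def characters_to_case(text, case):
    """case: 0 lower, 1 upper, 2 title case, 3 sentence case."""
    out = []
    upper_next = True                    # plays the part of bnd
    for ch in text:
        if case == 0:
            out.append(ch.lower())
        elif case == 1:
            out.append(ch.upper())
        else:
            out.append(ch.upper() if upper_next else ch.lower())
            if case == 2:
                # Title case: capitalise after any space or punctuation.
                upper_next = ch in BOUNDARY
            elif case == 3 and ch not in SPACE:
                if upper_next:
                    upper_next = False   # the sentence has now begun
                elif ch in ".!?":
                    upper_next = True    # the next sentence starts afresh
    return "".join(out)
```
</pre>
<p class="inwebparagraph">Under this sketch "the QUICK brown-fox" becomes "The Quick Brown-Fox" in title
case, and "hello world. how ARE you? fine" becomes "Hello world. How are you? Fine"
in sentence case.
</p>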
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP31"></a><b>&#167;31. Concatenation. </b>To concatenate two texts is to place one after the other: thus "green"
concatenated with "horn" makes "greenhorn". In this routine, <code class="display"><span class="extract">from_txt</span></code>
would be "horn", and is added at the end of <code class="display"><span class="extract">to_txt</span></code>, which is returned in
its expanded state.
</p>
<p class="inwebparagraph">When the blob type is <code class="display"><span class="extract">REGEXP_BLOB</span></code>, the routine is used not for simple
concatenation but to handle the concatenations occurring when a regular
expression search-and-replace is going on: see "RegExp.i6t".
</p>
<pre class="display">
<span class="plain">[ TEXT_TY_Concatenate to_txt from_txt blobtype ref_txt</span>
<span class="plain">p cp r;</span>
<span class="plain">if (to_txt==0) rfalse;</span>
<span class="plain">if (from_txt==0) return to_txt;</span>
<span class="plain">TEXT_TY_Transmute(to_txt);</span>
<span class="plain">cp = from_txt--&gt;0; p = TEXT_TY_Temporarily_Transmute(from_txt);</span>
<span class="plain">r = TEXT_TY_ConcatenateI(to_txt, from_txt, blobtype, ref_txt);</span>
<span class="plain">TEXT_TY_Untransmute(from_txt, p, cp);</span>
<span class="plain">return r;</span>
<span class="plain">];</span>
<span class="plain">[ TEXT_TY_ConcatenateI to_txt from_txt blobtype ref_txt</span>
<span class="plain">pos len ch i tosize x y case;</span>
<span class="plain">switch(blobtype) {</span>
<span class="plain">CHR_BLOB, 0:</span>
<span class="plain">pos = TEXT_TY_CharacterLength(to_txt);</span>
<span class="plain">len = TEXT_TY_CharacterLength(from_txt);</span>
<span class="plain">if (BlkValueSetLBCapacity(to_txt, pos+len+1) == false) return to_txt;</span>
<span class="plain">for (i=0:i&lt;len:i++) {</span>
<span class="plain">ch = BlkValueRead(from_txt, i);</span>
<span class="plain">BlkValueWrite(to_txt, i+pos, ch);</span>
<span class="plain">}</span>
<span class="plain">BlkValueWrite(to_txt, len+pos, 0);</span>
<span class="plain">return to_txt;</span>
<span class="plain">REGEXP_BLOB:</span>
<span class="plain">return TEXT_TY_RE_Concatenate(to_txt, from_txt, blobtype, ref_txt);</span>
<span class="plain">}</span>
<span class="plain">print "*** TEXT_TY_Concatenate used on impossible blob type ***^";</span>
<span class="plain">rfalse;</span>
<span class="plain">];</span>
</pre>
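<p class="inwebparagraph">The simple <code class="display"><span class="extract">CHR_BLOB</span></code> case can be sketched in Python as below, assuming that lists
of character codes ending in a zero stand in for the long blocks. The names
<code class="display"><span class="extract">text_length</span></code> and <code class="display"><span class="extract">concatenate</span></code> are hypothetical, and the sketch assumes that growing
the buffer always succeeds, whereas the routine above gives up if the capacity
cannot be raised.
</p>
<pre class="display">
```python
def text_length(buffer):
    """Number of character codes before the first zero terminator."""
    return buffer.index(0) if 0 in buffer else len(buffer)


def concatenate(to_buffer, from_buffer):
    """Append the text in from_buffer to the text in to_buffer."""
    pos = text_length(to_buffer)
    length = text_length(from_buffer)
    needed = pos + length + 1            # room for the terminator too
    if needed > len(to_buffer):
        to_buffer.extend([0] * (needed - len(to_buffer)))
    for i in range(length):
        to_buffer[pos + i] = from_buffer[i]
    to_buffer[pos + length] = 0          # terminate the combined text
    return to_buffer
```
</pre>
<p class="inwebparagraph">Given buffers holding "green" and "horn", the first buffer ends up holding
"greenhorn" followed by its terminator, and the second is left untouched.
</p>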
<p class="inwebparagraph"></p>
<hr class="tocbar">
<ul class="toc"><li><a href="S-tt.html">Back to 'Tables Template'</a></li><li><a href="S-ut.html">Continue with 'UnicodeData Template'</a></li></ul><hr class="tocbar">
<!--End of weave-->
</body>
</html>