mirror of
https://github.com/ganelson/inform.git
synced 2024-07-08 01:54:21 +03:00
1163 lines
65 KiB
HTML
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
|
||
|
<html>
|
||
|
<head>
|
||
|
<title>S/tt</title>
|
||
|
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
||
|
<meta http-equiv="Content-Language" content="en-gb">
|
||
|
<link href="inweb.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
||
|
</head>
|
||
|
<body>
|
||
|
|
||
|
<!--Weave of 'S/tt2' generated by 7-->
|
||
|
<ul class="crumbs"><li><a href="../webs.html">★</a></li><li><a href="index.html">basic_inform Template Library</a></li><li><b>Text Template</b></li></ul><p class="purpose">Code to support the text kind of value.</p>
|
||
|
|
||
|
<ul class="toc"><li><a href="#SP1">§1. Block Format</a></li><li><a href="#SP2">§2. Extent Of Long Block</a></li><li><a href="#SP3">§3. Character Set</a></li><li><a href="#SP4">§4. KOV Support</a></li><li><a href="#SP5">§5. Debugging</a></li><li><a href="#SP6">§6. Creation</a></li><li><a href="#SP7">§7. Copy Short Block</a></li><li><a href="#SP8">§8. Transmutation</a></li><li><a href="#SP9">§9. Mutability</a></li><li><a href="#SP10">§10. Casting</a></li><li><a href="#SP11">§11. Data Conversion</a></li><li><a href="#SP12">§12. Z Version</a></li><li><a href="#SP13">§13. Glulx Version</a></li><li><a href="#SP14">§14. Comparison</a></li><li><a href="#SP15">§15. Hashing</a></li><li><a href="#SP16">§16. Printing</a></li><li><a href="#SP17">§17. Capitalised printing</a></li><li><a href="#SP18">§18. Serialisation</a></li><li><a href="#SP19">§19. Unserialisation</a></li><li><a href="#SP20">§20. Substitution</a></li><li><a href="#SP21">§21. Perishability</a></li><li><a href="#SP22">§22. Blobs</a></li><li><a href="#SP23">§23. Blob Access</a></li><li><a href="#SP24">§24. Get Blob</a></li><li><a href="#SP25">§25. Replace Blob</a></li><li><a href="#SP26">§26. Replace Text</a></li><li><a href="#SP27">§27. Character Length</a></li><li><a href="#SP28">§28. Get Character</a></li><li><a href="#SP29">§29. Casing</a></li><li><a href="#SP30">§30. Change Case</a></li><li><a href="#SP31">§31. Concatenation</a></li></ul><hr class="tocbar">
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP1"></a><b>§1. Block Format. </b>The short block for a text is two words long: the first word selects which
form of storage is used to represent the content, and the second word
is a reference to that content. This reference is an I6 String or Routine
in all cases except one, when it is a pointer to a long block containing
a null-terminated array of characters, like a C string.
</p>

<p class="inwebparagraph">Clearly we need <code class="display"><span class="extract">PACKED_TEXT_STORAGE</span></code> and <code class="display"><span class="extract">UNPACKED_TEXT_STORAGE</span></code> to
distinguish between the two basic methods of text storage, roughly
equivalent to the pre-2013 kinds "text" and "indexed text". But why
do we need four storage forms rather than two?
</p>

<p class="inwebparagraph"><code class="display"><span class="extract">CONSTANT_PACKED_TEXT_STORAGE</span></code> is easy to explain: the BlkValue routines
normally detect constants using metadata in their long blocks, but of
course that won't work for values which haven't got any long blocks.
We use this storage form to mark such constants instead. We don't need a <code class="display"><span class="extract">CONSTANT_UNPACKED_TEXT_STORAGE</span></code>
because I7 never compiles constant text in unpacked form.
</p>

<p class="inwebparagraph">The surprising one is <code class="display"><span class="extract">CONSTANT_PERISHABLE_TEXT_STORAGE</span></code>. This is a
constant created by the I7 compiler which is marked as being tricky
because its value is a text substitution containing references to local
variables. Unlike other text substitutions, this can't meaningfully be
stored away to be expanded later: it must be expanded into unpacked
text before it perishes, that is, while those local variables still exist.
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">Constant CONSTANT_PACKED_TEXT_STORAGE = BLK_BVBITMAP_TEXT + BLK_BVBITMAP_CONSTANT + 1;</span>
|
||
|
<span class="plain">Constant CONSTANT_PERISHABLE_TEXT_STORAGE = BLK_BVBITMAP_TEXT + BLK_BVBITMAP_CONSTANT + 2;</span>
|
||
|
<span class="plain">Constant PACKED_TEXT_STORAGE = BLK_BVBITMAP_TEXT + 3;</span>
|
||
|
<span class="plain">Constant UNPACKED_TEXT_STORAGE = BLK_BVBITMAP_TEXT + BLK_BVBITMAP_LONGBLOCK + 4;</span>
|
||
|
</pre>
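<p class="inwebparagraph">As a rough illustration of how each storage value combines a small serial
number with flag bits, the following C sketch models the short block as a
two-word structure. The flag values are hypothetical (the real
<code class="display"><span class="extract">BLK_BVBITMAP_*</span></code> constants are defined elsewhere in the template and are not
shown in this section), and the names <code class="display"><span class="extract">ShortBlock</span></code>, <code class="display"><span class="extract">has_long_block</span></code> and
<code class="display"><span class="extract">is_constant</span></code> are invented for the example.
</p>

<pre class="display">
```c
#include <stdint.h>

/* Hypothetical flag bits: assumed values, chosen only so that they do not
   collide with the serial numbers 1 to 4 added below. */
enum {
    FLAG_LONGBLOCK = 0x10,
    FLAG_TEXT      = 0x20,
    FLAG_CONSTANT  = 0x40
};

/* The four storage forms, built the same way as the constants above. */
enum {
    CONSTANT_PACKED     = FLAG_TEXT + FLAG_CONSTANT + 1,
    CONSTANT_PERISHABLE = FLAG_TEXT + FLAG_CONSTANT + 2,
    PACKED              = FLAG_TEXT + 3,
    UNPACKED            = FLAG_TEXT + FLAG_LONGBLOCK + 4
};

/* A two-word short block: word 0 selects the storage form, word 1 refers
   to the content. */
typedef struct {
    intptr_t storage;
    intptr_t content;
} ShortBlock;

/* Only unpacked text carries the long-block flag. */
int has_long_block(const ShortBlock *sb)
{
    return (sb->storage & FLAG_LONGBLOCK) != 0;
}

/* Constants are recognised from the short block alone. */
int is_constant(const ShortBlock *sb)
{
    return (sb->storage & FLAG_CONSTANT) != 0;
}
```
</pre>

<p class="inwebparagraph">Under these assumed values, a single mask test on the first word is enough
to tell whether a text has a long block or is a constant, which matches the
way the routines below test the first word against a mask.
</p>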
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP2"></a><b>§2. Extent Of Long Block. </b>When there's a long block, we need enough entries to store every
character of the text, plus one for the null terminator.
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_Extent arg1 x;</span>
|
||
|
<span class="plain">x = BlkValueSeekZeroEntry(arg1);</span>
|
||
|
<span class="plain">if (x < 0) return -1; ! should not happen, of course</span>
|
||
|
<span class="plain">return x+1;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
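<p class="inwebparagraph">The same calculation can be sketched in C. This is only an analogy: the
function name <code class="display"><span class="extract">text_extent</span></code> and the explicit <code class="display"><span class="extract">capacity</span></code> parameter are
assumptions made for the example, standing in for whatever
<code class="display"><span class="extract">BlkValueSeekZeroEntry</span></code> consults internally.
</p>

<pre class="display">
```c
#include <stddef.h>

/* Returns the number of entries needed to hold the text together with its
   null terminator, or -1 if no terminator lies within the capacity. */
long text_extent(const unsigned short *chars, size_t capacity)
{
    for (size_t i = 0; i < capacity; i++)
        if (chars[i] == 0) return (long)i + 1;
    return -1;
}
```
</pre>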
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP3"></a><b>§3. Character Set. </b>On the Z-machine, we use the 8-bit ZSCII character set, stored in bytes;
on Glulx, we use the opening 16-bit subset of Unicode (which, though only a
subset, covers almost all letter forms used on Earth), stored in half-words.
</p>

<p class="inwebparagraph">The Z-machine does have very partial Unicode support, but not in a way that
can help us here. It is capable of printing a wide range of Unicode
characters, and on a good interpreter with a good font (such as Zoom for Mac
OS X, using the Lucida Grande font) can produce many thousands of glyphs. But
it is not capable of printing those characters into memory rather than to the
screen, an essential technique for texts: it can only write each character to
a single byte, and it does so in ZSCII. That forces our hand when it comes to
choosing the indexed-text character set.
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">#IFDEF TARGET_ZCODE;</span>
|
||
|
<span class="plain">Constant TEXT_TY_Storage_Flags = BLK_FLAG_MULTIPLE;</span>
|
||
|
<span class="plain">Constant ZSCII_Tables;</span>
|
||
|
<span class="plain">#IFNOT;</span>
|
||
|
<span class="plain">Constant TEXT_TY_Storage_Flags = BLK_FLAG_MULTIPLE + BLK_FLAG_16_BIT;</span>
|
||
|
<span class="plain">Constant Large_Unicode_Tables;</span>
|
||
|
<span class="plain">#ENDIF;</span>
|
||
|
|
||
|
<span class="plain">{-segment:UnicodeData.i6t}</span>
|
||
|
<span class="plain">{-segment:Char.i6t}</span>
|
||
|
</pre>
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP4"></a><b>§4. KOV Support. </b>See the "BlockValues.i6t" segment for the specification of the following
routines. Because a text never contains other block values, its contents can
freely be bitwise copied or forgotten, which is why we need do nothing
special to copy or destroy a text.
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_Support task arg1 arg2 arg3;</span>
|
||
|
<span class="plain">switch(task) {</span>
|
||
|
<span class="plain">CREATE_KOVS: return TEXT_TY_Create(arg2);</span>
|
||
|
<span class="plain">CAST_KOVS: TEXT_TY_Cast(arg1, arg2, arg3);</span>
|
||
|
<span class="plain">MAKEMUTABLE_KOVS: return TEXT_TY_Mutable(arg1);</span>
|
||
|
<span class="plain">COPYQUICK_KOVS: rtrue;</span>
|
||
|
<span class="plain">COPYSB_KOVS: TEXT_TY_CopySB(arg1, arg2);</span>
|
||
|
<span class="plain">KINDDATA_KOVS: return 0;</span>
|
||
|
<span class="plain">EXTENT_KOVS: return TEXT_TY_Extent(arg1);</span>
|
||
|
<span class="plain">COMPARE_KOVS: return TEXT_TY_Compare(arg1, arg2);</span>
|
||
|
<span class="plain">READ_FILE_KOVS: if (arg3 == -1) rtrue;</span>
|
||
|
<span class="plain">return TEXT_TY_ReadFile(arg1, arg2, arg3);</span>
|
||
|
<span class="plain">WRITE_FILE_KOVS: return TEXT_TY_WriteFile(arg1);</span>
|
||
|
<span class="plain">HASH_KOVS: return TEXT_TY_Hash(arg1);</span>
|
||
|
<span class="plain">DEBUG_KOVS: TEXT_TY_Debug(arg1);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">! We choose not to respond to: DESTROY_KOVS, COPYKIND_KOVS, COPY_KOVS</span>
|
||
|
<span class="plain">rfalse;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP5"></a><b>§5. Debugging. </b>This shows the various forms a text's short block can take:
|
||
|
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_Debug txt;</span>
|
||
|
<span class="plain">switch (txt-->0) {</span>
|
||
|
<span class="plain">CONSTANT_PACKED_TEXT_STORAGE: print " = cp~", (PrintI6Text) txt-->1, "~";</span>
|
||
|
<span class="plain">CONSTANT_PERISHABLE_TEXT_STORAGE: print " = cp~", (PrintI6Text) txt-->1, "~";</span>
|
||
|
<span class="plain">PACKED_TEXT_STORAGE: print " = p~", (PrintI6Text) txt-->1, "~";</span>
|
||
|
<span class="plain">UNPACKED_TEXT_STORAGE: print " = ~", (TEXT_TY_Say) txt, "~";</span>
|
||
|
<span class="plain">default: print " broken?";</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP6"></a><b>§6. Creation. </b>A newly created text is a two-word short block with no long block, like this:
|
||
|
</p>
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">Array ThisIsAText --> PACKED_TEXT_STORAGE EMPTY_TEXT_PACKED;</span>
|
||
|
<span class="plain">[ TEXT_TY_Create short_block x;</span>
|
||
|
<span class="plain">return BlkValueCreateSB2(short_block, PACKED_TEXT_STORAGE, EMPTY_TEXT_PACKED);</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP7"></a><b>§7. Copy Short Block. </b>When a short block for a constant is copied, the new copy isn't a constant
any more.
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_CopySB to_bv from_bv;</span>
|
||
|
<span class="plain">BlkValueCopySB2(to_bv, from_bv);</span>
|
||
|
<span class="plain">if (to_bv-->0 & BLK_BVBITMAP_CONSTANTMASK) to_bv-->0 = PACKED_TEXT_STORAGE;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP8"></a><b>§8. Transmutation. </b>What happens if a text is stored in packed form, but we need to access or
change its individual characters? The answer is that we have to "transmute"
it into long block form. Sometimes this is a permanent change, but often
it's only temporary, and will soon be followed by an un-transmutation.
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_Transmute txt;</span>
|
||
|
<span class="plain">TEXT_TY_Temporarily_Transmute(txt);</span>
|
||
|
<span class="plain">];</span>
|
||
|
|
||
|
<span class="plain">[ TEXT_TY_Temporarily_Transmute txt x;</span>
|
||
|
<span class="plain">if ((txt) && (txt-->0 & BLK_BVBITMAP_LONGBLOCKMASK == 0)) {</span>
|
||
|
<span class="plain">x = txt-->1; ! The old value was a packed string</span>
|
||
|
|
||
|
<span class="plain">txt-->0 = UNPACKED_TEXT_STORAGE;</span>
|
||
|
<span class="plain">txt-->1 = FlexAllocate(32, TEXT_TY, TEXT_TY_Storage_Flags);</span>
|
||
|
<span class="plain">if (x ~= EMPTY_TEXT_PACKED) TEXT_TY_CastPrimitive(txt, false, x);</span>
|
||
|
|
||
|
<span class="plain">return x;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">return 0;</span>
|
||
|
<span class="plain">];</span>
|
||
|
|
||
|
<span class="plain">[ TEXT_TY_Untransmute txt pk cp x;</span>
|
||
|
<span class="plain">if ((pk) && (txt-->0 == UNPACKED_TEXT_STORAGE)) {</span>
|
||
|
<span class="plain">x = txt-->1; ! The old value was an unpacked string</span>
|
||
|
<span class="plain">FlexFree(x);</span>
|
||
|
<span class="plain">txt-->0 = cp;</span>
|
||
|
<span class="plain">txt-->1 = pk; ! The value earlier returned by TEXT_TY_Temporarily_Transmute</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">return txt;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
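<p class="inwebparagraph">The pattern of expanding a packed text, working on the expansion, and then
restoring the original reference can be sketched in C as follows. This is a
simplified model, not the template's own mechanism: the <code class="display"><span class="extract">Text</span></code> structure,
the two-valued <code class="display"><span class="extract">Storage</span></code> type and the use of <code class="display"><span class="extract">malloc</span></code> in place of
<code class="display"><span class="extract">FlexAllocate</span></code> are all assumptions made for the example, and a C string
literal stands in for a packed I6 String or Routine.
</p>

<pre class="display">
```c
#include <stdlib.h>
#include <string.h>

typedef enum { PACKED, UNPACKED } Storage;

typedef struct {
    Storage storage;
    union {
        const char *packed;  /* read-only; stands in for an I6 String or Routine */
        char *unpacked;      /* heap-allocated, mutable, null-terminated */
    } content;
} Text;

/* Expands a packed text in place. Returns the old packed reference so that
   the caller can restore it later, or NULL if nothing was changed (the text
   was already unpacked, or memory could not be allocated). */
const char *temporarily_transmute(Text *t)
{
    if (t == NULL || t->storage == UNPACKED) return NULL;
    const char *old = t->content.packed;
    char *copy = malloc(strlen(old) + 1);
    if (copy == NULL) return NULL;
    strcpy(copy, old);
    t->storage = UNPACKED;
    t->content.unpacked = copy;
    return old;
}

/* Undoes temporarily_transmute, given the reference it returned and the
   storage form the text had beforehand. Does nothing if old is NULL. */
void untransmute(Text *t, const char *old, Storage old_storage)
{
    if (old != NULL && t->storage == UNPACKED) {
        free(t->content.unpacked);
        t->storage = old_storage;
        t->content.packed = old;
    }
}
```
</pre>

<p class="inwebparagraph">As in the routines above, the caller records the old storage form before
transmuting and passes it back afterwards, and a null return value signals
that there is nothing to undo.
</p>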
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP9"></a><b>§9. Mutability. </b>That neatly handles the question of how to make a text mutable. (Note that
constants are never created in unpacked form.)
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_Mutable txt;</span>
|
||
|
<span class="plain">if (txt-->0 & BLK_BVBITMAP_LONGBLOCKMASK == 0) {</span>
|
||
|
<span class="plain">TEXT_TY_Transmute(txt);</span>
|
||
|
<span class="plain">return 0;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">return 2; ! Tell BlockValue there's a long block pointer</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP10"></a><b>§10. Casting. </b>In general computing, "casting" is the process of translating data in one
type into semantically equivalent data in another: the only interesting
cast here is that a snippet can be turned into a text.
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_Cast to_txt from_kind from_value;</span>
|
||
|
<span class="plain">if (from_kind == TEXT_TY) {</span>
|
||
|
<span class="plain">BlkValueCopy(to_txt, from_value);</span>
|
||
|
<span class="plain">} else if (from_kind == SNIPPET_TY) {</span>
|
||
|
<span class="plain">TEXT_TY_Transmute(to_txt);</span>
|
||
|
<span class="plain">TEXT_TY_CastPrimitive(to_txt, true, from_value);</span>
|
||
|
<span class="plain">} else BlkValueError("impossible cast to text");</span>
|
||
|
<span class="plain">];</span>
|
||
|
|
||
|
<span class="plain">[ SNIPPET_TY_to_TEXT_TY to_txt snippet;</span>
|
||
|
<span class="plain">return BlkValueCast(to_txt, SNIPPET_TY, snippet);</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP11"></a><b>§11. Data Conversion. </b>We use a single routine to handle two kinds of format translation: a
packed I6 string into an unpacked text, or a snippet into an unpacked text.
</p>

<p class="inwebparagraph">In each case, what we do is simply to print out the value we have, but with
the output stream set to memory rather than the screen. That gives us the
character by character version, neatly laid out in an array, and all we have
to do is to copy it into the text and add a null terminator.
</p>

<p class="inwebparagraph">What complicates things is that the two virtual machines handle printing
to memory quite differently, and that the original text has unpredictable
length. We are going to try printing it into the array <code class="display"><span class="extract">TEXT_TY_Buffers</span></code>,
but what if the text is too big? Disastrously, the Z-machine simply
writes on in memory, corrupting all subsequent arrays and almost certainly
causing the story file to crash soon after. There is nothing we can do
to predict or avoid this, or to repair the damage: this is why the Inform
documentation warns users to be wary of using text with large
strings in the Z-machine, and advises the use of Glulx instead. Glulx
does handle overruns safely, and indeed allows us to dynamically allocate
memory as necessary so that we can always avoid overruns entirely.
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">Constant TEXT_TY_NoBuffers = 2;</span>
|
||
|
|
||
|
<span class="plain">#ifdef TARGET_ZCODE;</span>
|
||
|
<span class="plain">Array TEXT_TY_Buffers -> TEXT_TY_BufferSize*TEXT_TY_NoBuffers; ! Where characters are bytes</span>
|
||
|
<span class="plain">#ifnot;</span>
|
||
|
<span class="plain">Array TEXT_TY_Buffers --> (TEXT_TY_BufferSize+2)*TEXT_TY_NoBuffers; ! Where characters are words</span>
|
||
|
<span class="plain">#endif;</span>
|
||
|
|
||
|
<span class="plain">Global RawBufferAddress = TEXT_TY_Buffers;</span>
|
||
|
<span class="plain">Global RawBufferSize = TEXT_TY_BufferSize;</span>
|
||
|
|
||
|
<span class="plain">Global TEXT_TY_CastPrimitiveNesting = 0;</span>
|
||
|
</pre>
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP12"></a><b>§12. Z Version. </b>The two versions of this routine, one for each virtual machine, are in all
important respects the same, but there are enough fiddly differences that
it's clearer to give two definitions, so:
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">#ifdef TARGET_ZCODE;</span>
|
||
|
<span class="plain">[ TEXT_TY_CastPrimitive to_txt from_snippet from_value len news buffer;</span>
|
||
|
<span class="plain">if (to_txt == 0) BlkValueError("no destination for cast");</span>
|
||
|
<span class="plain">SuspendRTP();</span>
|
||
|
<span class="plain">buffer = RawBufferAddress + TEXT_TY_CastPrimitiveNesting*TEXT_TY_BufferSize;</span>
|
||
|
<span class="plain">TEXT_TY_CastPrimitiveNesting++;</span>
|
||
|
<span class="plain">if (TEXT_TY_CastPrimitiveNesting > TEXT_TY_NoBuffers)</span>
|
||
|
<span class="plain">FlexError("ran out with too many simultaneous text conversions");</span>
|
||
|
|
||
|
<span class="plain">@push say__p; @push say__pc;</span>
|
||
|
<span class="plain">ClearParagraphing(6);</span>
|
||
|
<span class="plain">@output_stream 3 buffer;</span>
|
||
|
<span class="plain">if (from_value) {</span>
|
||
|
<span class="plain">if (from_snippet) print (PrintSnippet) from_value;</span>
|
||
|
<span class="plain">else print (PrintI6Text) from_value;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">@output_stream -3;</span>
|
||
|
<span class="plain">@pull say__pc; @pull say__p;</span>
|
||
|
<span class="plain">ResumeRTP();</span>
|
||
|
|
||
|
<span class="plain">len = buffer-->0;</span>
|
||
|
<span class="plain">if (len > RawBufferSize-1) len = RawBufferSize-1;</span>
|
||
|
<span class="plain">buffer->(len+2) = 0;</span>
|
||
|
|
||
|
<span class="plain">TEXT_TY_CastPrimitiveNesting--;</span>
|
||
|
<span class="plain">BlkValueMassCopyFromArray(to_txt, buffer+2, 1, len+1);</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP13"></a><b>§13. Glulx Version. </b></p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">#ifnot; ! TARGET_ZCODE</span>
|
||
|
<span class="plain">[ TEXT_TY_CastPrimitive to_txt from_snippet from_value</span>
|
||
|
<span class="plain">len i stream saved_stream news buffer buffer_size memory_to_free results;</span>
|
||
|
|
||
|
<span class="plain">if (to_txt == 0) BlkValueError("no destination for cast");</span>
|
||
|
|
||
|
<span class="plain">buffer_size = (TEXT_TY_BufferSize + 2)*WORDSIZE;</span>
|
||
|
|
||
|
<span class="plain">RawBufferSize = TEXT_TY_BufferSize;</span>
|
||
|
<span class="plain">buffer = RawBufferAddress + TEXT_TY_CastPrimitiveNesting*buffer_size;</span>
|
||
|
<span class="plain">TEXT_TY_CastPrimitiveNesting++;</span>
|
||
|
<span class="plain">if (TEXT_TY_CastPrimitiveNesting > TEXT_TY_NoBuffers) {</span>
|
||
|
<span class="plain">buffer = VM_AllocateMemory(buffer_size); memory_to_free = buffer;</span>
|
||
|
<span class="plain">if (buffer == 0)</span>
|
||
|
<span class="plain">FlexError("ran out with too many simultaneous text conversions");</span>
|
||
|
<span class="plain">}</span>
|
||
|
|
||
|
<span class="plain">if (unicode_gestalt_ok) {</span>
|
||
|
<span class="plain">SuspendRTP();</span>
|
||
|
<span class="plain">.RetryWithLargerBuffer;</span>
|
||
|
<span class="plain">saved_stream = glk_stream_get_current();</span>
|
||
|
<span class="plain">stream = glk_stream_open_memory_uni(buffer, RawBufferSize, filemode_Write, 0);</span>
|
||
|
<span class="plain">glk_stream_set_current(stream);</span>
|
||
|
|
||
|
<span class="plain">@push say__p; @push say__pc;</span>
|
||
|
<span class="plain">ClearParagraphing(7);</span>
|
||
|
<span class="plain">if (from_snippet) print (PrintSnippet) from_value;</span>
|
||
|
<span class="plain">else print (PrintI6Text) from_value;</span>
|
||
|
<span class="plain">@pull say__pc; @pull say__p;</span>
|
||
|
|
||
|
<span class="plain">results = buffer + buffer_size - 2*WORDSIZE;</span>
|
||
|
<span class="plain">glk_stream_close(stream, results);</span>
|
||
|
<span class="plain">if (saved_stream) glk_stream_set_current(saved_stream);</span>
|
||
|
<span class="plain">ResumeRTP();</span>
|
||
|
|
||
|
<span class="plain">len = results-->1;</span>
|
||
|
<span class="plain">if (len > RawBufferSize-1) {</span>
|
||
|
<span class="plain">! Glulx had to truncate text output because the buffer ran out:</span>
|
||
|
<span class="plain">! len is the number of characters which it tried to print</span>
|
||
|
<span class="plain">news = RawBufferSize;</span>
|
||
|
<span class="plain">while (news < len) news=news*2;</span>
|
||
|
<span class="plain">i = VM_AllocateMemory(news*WORDSIZE);</span>
|
||
|
<span class="plain">if (i ~= 0) {</span>
|
||
|
<span class="plain">if (memory_to_free) VM_FreeMemory(memory_to_free);</span>
|
||
|
<span class="plain">memory_to_free = i;</span>
|
||
|
<span class="plain">buffer = i;</span>
|
||
|
<span class="plain">RawBufferSize = news;</span>
|
||
|
<span class="plain">buffer_size = (RawBufferSize + 2)*WORDSIZE;</span>
|
||
|
<span class="plain">jump RetryWithLargerBuffer;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">! Memory allocation refused: all we can do is to truncate the text</span>
|
||
|
<span class="plain">len = RawBufferSize-1;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">buffer-->(len) = 0;</span>
|
||
|
|
||
|
<span class="plain">TEXT_TY_CastPrimitiveNesting--;</span>
|
||
|
<span class="plain">BlkValueMassCopyFromArray(to_txt, buffer, 4, len+1);</span>
|
||
|
<span class="plain">} else {</span>
|
||
|
<span class="plain">RunTimeProblem(RTP_NOGLULXUNICODE);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">if (memory_to_free) VM_FreeMemory(memory_to_free);</span>
|
||
|
<span class="plain">];</span>
|
||
|
<span class="plain">#endif;</span>
|
||
|
</pre>
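<p class="inwebparagraph">The retry strategy used in the Glulx version (print into a buffer, find out
how many characters the output really needed, and try again with a larger
buffer if it did not fit) has a close analogue in C, where <code class="display"><span class="extract">snprintf</span></code>
likewise reports the length it would have written. The sketch below is
offered only as an analogy: the function <code class="display"><span class="extract">render_to_memory</span></code>, its arguments
and the starting size of 8 are invented for the example, and nothing here
uses Glk.
</p>

<pre class="display">
```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Renders "label=value" into freshly allocated memory. Starts with a small
   buffer and doubles its size whenever snprintf reports that the output
   would not fit. The caller frees the result. Returns NULL if memory
   cannot be allocated or formatting fails. */
char *render_to_memory(const char *label, int value)
{
    size_t size = 8;                 /* deliberately small first guess */
    char *buffer = malloc(size);
    while (buffer != NULL) {
        int wanted = snprintf(buffer, size, "%s=%d", label, value);
        if (wanted < 0) { free(buffer); return NULL; }
        if ((size_t)wanted < size) return buffer;  /* fitted, terminator included */
        while (size <= (size_t)wanted) size *= 2;  /* grow past the reported length */
        free(buffer);
        buffer = malloc(size);
    }
    return NULL;
}
```
</pre>

<p class="inwebparagraph">One difference from the routine above is worth noting: if allocation is
refused, this sketch gives up and returns nothing, whereas the Glulx routine
falls back to truncating the text.
</p>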
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP14"></a><b>§14. Comparison. </b>This is more or less <code class="display"><span class="extract">strcmp</span></code>, the traditional C library routine for comparing
strings, but it does pose a few interesting questions. The answers are:
</p>

<p class="inwebparagraph"></p>

<ul class="items"><li>(a) Two different unexpanded texts with substitutions are never equal, so
"[X]" and "[Y]" aren't equal as texts even if X and Y are equal.
</li><li>(b) Otherwise we test the current value of the text as expanded, so "[X]"
and "17" can be equal as texts if X is 17.
</li></ul>
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_Compare left_txt right_txt rv;</span>
|
||
|
<span class="plain">@push say__comp;</span>
|
||
|
<span class="plain">say__comp = true;</span>
|
||
|
<span class="plain">rv = TEXT_TY_Compare_Inner(left_txt, right_txt);</span>
|
||
|
<span class="plain">@pull say__comp;</span>
|
||
|
<span class="plain">return rv;</span>
|
||
|
<span class="plain">];</span>
|
||
|
|
||
|
<span class="plain">[ TEXT_TY_Compare_Inner left_txt right_txt</span>
|
||
|
<span class="plain">pos ch1 ch2 capacity_left capacity_right fl fr cl cr cpl cpr;</span>
|
||
|
<span class="plain">if (left_txt-->0 & BLK_BVBITMAP_LONGBLOCKMASK == 0) fl = true;</span>
|
||
|
<span class="plain">if (right_txt-->0 & BLK_BVBITMAP_LONGBLOCKMASK == 0) fr = true;</span>
|
||
|
|
||
|
<span class="plain">if (fl && fr) {</span>
|
||
|
<span class="plain">if ((left_txt-->1 ofclass String) && (right_txt-->1 ofclass String))</span>
|
||
|
<span class="plain">return left_txt-->1 - right_txt-->1;</span>
|
||
|
<span class="plain">if ((left_txt-->1 ofclass Routine) && (right_txt-->1 ofclass Routine))</span>
|
||
|
<span class="plain">return left_txt-->1 - right_txt-->1;</span>
|
||
|
<span class="plain">cpl = left_txt-->0; cl = TEXT_TY_Temporarily_Transmute(left_txt);</span>
|
||
|
<span class="plain">cpr = right_txt-->0; cr = TEXT_TY_Temporarily_Transmute(right_txt);</span>
|
||
|
<span class="plain">} else if (fl) {</span>
|
||
|
<span class="plain">cpl = left_txt-->0; cl = TEXT_TY_Temporarily_Transmute(left_txt);</span>
|
||
|
<span class="plain">} else if (fr) {</span>
|
||
|
<span class="plain">cpr = right_txt-->0; cr = TEXT_TY_Temporarily_Transmute(right_txt);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">if ((cl) || (cr)) {</span>
|
||
|
<span class="plain">pos = TEXT_TY_Compare(left_txt, right_txt);</span>
|
||
|
<span class="plain">TEXT_TY_Untransmute(left_txt, cl, cpl);</span>
|
||
|
<span class="plain">TEXT_TY_Untransmute(right_txt, cr, cpr);</span>
|
||
|
<span class="plain">return pos;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">capacity_left = BlkValueLBCapacity(left_txt);</span>
|
||
|
<span class="plain">capacity_right = BlkValueLBCapacity(right_txt);</span>
|
||
|
<span class="plain">for (pos=0:(pos&lt;capacity_left) && (pos&lt;capacity_right):pos++) {</span>
|
||
|
<span class="plain">ch1 = BlkValueRead(left_txt, pos);</span>
|
||
|
<span class="plain">ch2 = BlkValueRead(right_txt, pos);</span>
|
||
|
<span class="plain">if (ch1 ~= ch2) return ch1-ch2;</span>
|
||
|
<span class="plain">if (ch1 == 0) return 0;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">if (pos == capacity_left) return -1;</span>
|
||
|
<span class="plain">return 1;</span>
|
||
|
<span class="plain">];</span>
|
||
|
|
||
|
<span class="plain">[ TEXT_TY_Distinguish left_txt right_txt;</span>
|
||
|
<span class="plain">if (TEXT_TY_Compare(left_txt, right_txt) == 0) rfalse;</span>
|
||
|
<span class="plain">rtrue;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
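<p class="inwebparagraph">The final character-by-character loop, which applies once both texts are in
unpacked form, can be sketched in C as below. The function name
<code class="display"><span class="extract">text_compare</span></code> and the explicit capacity arguments are assumptions made
for the example; the handling of packed texts and substitutions described in
(a) and (b) is not modelled.
</p>

<pre class="display">
```c
#include <stddef.h>

/* Compares two null-terminated character arrays, never reading beyond
   either capacity. Returns a negative, zero or positive value in the
   manner of strcmp. */
int text_compare(const unsigned short *left, size_t capacity_left,
                 const unsigned short *right, size_t capacity_right)
{
    size_t pos;
    for (pos = 0; pos < capacity_left && pos < capacity_right; pos++) {
        if (left[pos] != right[pos])
            return (int)left[pos] - (int)right[pos];
        if (left[pos] == 0) return 0;   /* both ended at the same place */
    }
    /* One array ran out before a terminator or a difference was found. */
    return (pos == capacity_left) ? -1 : 1;
}
```
</pre>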
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP15"></a><b>§15. Hashing. </b>This calculates a hash value for the string, using Bernstein's algorithm.
|
||
|
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_Hash txt rv len i p cp;</span>
|
||
|
<span class="plain">cp = txt-->0; p = TEXT_TY_Temporarily_Transmute(txt);</span>
|
||
|
<span class="plain">rv = 0;</span>
|
||
|
<span class="plain">len = BlkValueLBCapacity(txt);</span>
|
||
|
<span class="plain">for (i=0: i&lt;len: i++)</span>
|
||
|
<span class="plain">rv = rv * 33 + BlkValueRead(txt, i);</span>
|
||
|
<span class="plain">TEXT_TY_Untransmute(txt, p, cp);</span>
|
||
|
<span class="plain">return rv;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
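<p class="inwebparagraph">For comparison, here is the same multiply-by-33 recurrence as a small C
function. It is a sketch with two stated differences from the routine above:
it stops at the null terminator rather than running over the whole capacity
of the long block, and it works in 32-bit unsigned arithmetic, which is an
assumption of the example rather than a property of either virtual machine.
The name <code class="display"><span class="extract">text_hash</span></code> is invented.
</p>

<pre class="display">
```c
#include <stdint.h>

/* Bernstein-style hash: each character is folded in as rv = rv*33 + ch,
   starting from zero as the routine above does. */
uint32_t text_hash(const unsigned short *chars)
{
    uint32_t rv = 0;
    for (; *chars != 0; chars++)
        rv = rv * 33u + *chars;
    return rv;
}
```
</pre>

<p class="inwebparagraph">For the two-character text "AB" this gives 65 × 33 + 66 = 2211.
</p>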
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP16"></a><b>§16. Printing. </b>Unicode is not the native character set on Glulx: it came along as a late
addition to Glulx's specification. The deal is that we have to explicitly
tell the Glk interface layer to perform certain operations in a Unicode way;
if we simply perform <code class="display"><span class="extract">print (char) ch;</span></code> then the character <code class="display"><span class="extract">ch</span></code> will be
printed in ZSCII rather than Unicode.
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_Say txt ch i dsize;</span>
|
||
|
<span class="plain">if (txt==0) rfalse;</span>
|
||
|
<span class="plain">if (txt-->0 & BLK_BVBITMAP_LONGBLOCKMASK == 0) return PrintI6Text(txt-->1);</span>
|
||
|
<span class="plain">dsize = BlkValueLBCapacity(txt);</span>
|
||
|
<span class="plain">for (i=0: i&lt;dsize: i++) {</span>
|
||
|
<span class="plain">ch = BlkValueRead(txt, i);</span>
|
||
|
<span class="plain">if (ch == 0) break;</span>
|
||
|
<span class="plain">#ifdef TARGET_ZCODE;</span>
|
||
|
<span class="plain">print (char) ch;</span>
|
||
|
<span class="plain">#ifnot; ! TARGET_ZCODE</span>
|
||
|
<span class="plain">@streamunichar ch;</span>
|
||
|
<span class="plain">#endif;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">if (i == 0) rfalse;</span>
|
||
|
<span class="plain">rtrue;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP17"></a><b>§17. Capitalised printing. </b>It turns out to be useful to have a variation on this:
|
||
|
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_Say_Capitalised txt mod rc;</span>
|
||
|
<span class="plain">mod = BlkValueCreate(TEXT_TY);</span>
|
||
|
<span class="plain">TEXT_TY_SubstitutedForm(mod, txt);</span>
|
||
|
<span class="plain">if (TEXT_TY_CharacterLength(mod) > 0) {</span>
|
||
|
<span class="plain">BlkValueWrite(mod, 0, CharToCase(BlkValueRead(mod, 0), 1));</span>
|
||
|
<span class="plain">TEXT_TY_Say(mod);</span>
|
||
|
<span class="plain">rc = true;</span>
|
||
|
<span class="plain">say__p = 1;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">BlkValueFree(mod);</span>
|
||
|
<span class="plain">return rc;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
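<p class="inwebparagraph">The essential idea (work on a private, fully expanded copy so that the
original text is left untouched, then change only its first character) can
be sketched in C. The function <code class="display"><span class="extract">capitalised_copy</span></code> is invented for the
example, handles only single-byte characters through the standard
<code class="display"><span class="extract">toupper</span></code>, and leaves printing and freeing to the caller.
</p>

<pre class="display">
```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of txt with its first character in upper
   case, or NULL if memory cannot be allocated. An empty text is copied
   unchanged. The caller frees the result. */
char *capitalised_copy(const char *txt)
{
    char *mod = malloc(strlen(txt) + 1);
    if (mod == NULL) return NULL;
    strcpy(mod, txt);
    if (mod[0] != '\0')
        mod[0] = (char)toupper((unsigned char)mod[0]);
    return mod;
}
```
</pre>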
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP18"></a><b>§18. Serialisation. </b>Here we print a serialised form of a text which can later be used
to reconstruct the original text. The printing is apparently to the screen,
but in fact always takes place when the output stream is a file.
</p>

<p class="inwebparagraph">The format chosen is a letter "S" for string, then a comma-separated list
of decimal character codes, ending with the null terminator, and followed by
a semicolon: thus <code class="display"><span class="extract">S65,66,67,0;</span></code> is the serialised form of the text "ABC".
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_WriteFile txt len pos ch p cp;</span>
|
||
|
<span class="plain">cp = txt-->0; p = TEXT_TY_Temporarily_Transmute(txt);</span>
|
||
|
<span class="plain">len = BlkValueLBCapacity(txt);</span>
|
||
|
<span class="plain">print "S";</span>
|
||
|
<span class="plain">for (pos=0: pos<=len: pos++) {</span>
|
||
|
<span class="plain">if (pos == len) ch = 0; else ch = BlkValueRead(txt, pos);</span>
|
||
|
<span class="plain">if (ch == 0) {</span>
|
||
|
<span class="plain">print "0;"; break;</span>
|
||
|
<span class="plain">} else {</span>
|
||
|
<span class="plain">print ch, ",";</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">TEXT_TY_Untransmute(txt, p, cp);</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
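<p class="inwebparagraph">A C sketch of the same format may make the layout concrete. It writes into a
caller-supplied character buffer rather than to an output stream, which is
an assumption of the example, as is the name <code class="display"><span class="extract">serialise_text</span></code>.
</p>

<pre class="display">
```c
#include <stdio.h>
#include <string.h>

/* Writes the serialised form of a null-terminated text into out: the
   letter S, each character code in decimal followed by a comma, and
   finally "0;" for the terminator. Returns the number of characters
   written, or -1 if out is too small. */
int serialise_text(const unsigned short *chars, char *out, size_t out_size)
{
    if (out_size < 2) return -1;
    size_t used = 0;
    out[used++] = 'S';
    out[used] = '\0';
    for (;; chars++) {
        size_t room = out_size - used;
        int n = snprintf(out + used, room, "%u%c",
                         (unsigned)*chars, *chars ? ',' : ';');
        if (n < 0 || (size_t)n >= room) return -1;
        used += (size_t)n;
        if (*chars == 0) return (int)used;  /* terminator written: done */
    }
}
```
</pre>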
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP19"></a><b>§19. Unserialisation. </b>If that's the word: the reverse process, in which we read a stream of
characters from a file and reconstruct the text which gave rise to
them.
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_ReadFile txt auxf ch i v dg pos tsize p;</span>
|
||
|
<span class="plain">TEXT_TY_Transmute(txt);</span>
|
||
|
<span class="plain">tsize = BlkValueLBCapacity(txt);</span>
|
||
|
<span class="plain">while (ch ~= 32 or 9 or 10 or 13 or 0 or -1) {</span>
|
||
|
<span class="plain">ch = FileIO_GetC(auxf);</span>
|
||
|
<span class="plain">if (ch == ',' or ';') {</span>
|
||
|
<span class="plain">if (pos+1 >= tsize) {</span>
|
||
|
<span class="plain">if (BlkValueSetLBCapacity(txt, 2*pos) == false) break;</span>
|
||
|
<span class="plain">tsize = BlkValueLBCapacity(txt);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">BlkValueWrite(txt, pos++, v);</span>
|
||
|
<span class="plain">v = 0;</span>
|
||
|
<span class="plain">if (ch == ';') break;</span>
|
||
|
<span class="plain">} else {</span>
|
||
|
<span class="plain">dg = ch - '0';</span>
|
||
|
<span class="plain">v = v*10 + dg;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">BlkValueWrite(txt, pos, 0);</span>
|
||
|
<span class="plain">return txt;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
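<p class="inwebparagraph">The parsing loop can be sketched in Python as below. This is an illustration under assumptions rather than a port: the name <code class="display"><span class="extract">unserialise_text</span></code> is hypothetical, the whole serialised form is taken as a string, and the sketch expects the leading "S" to be present (how the real caller consumes it is not shown in this section).</p>

```python
def unserialise_text(data):
    """Rebuild a text from its 'S65,66,67,0;' form (illustrative only)."""
    if not data.startswith("S"):
        raise ValueError("expected a serialised text beginning with 'S'")
    chars = []
    value = 0  # decimal value accumulated so far
    for ch in data[1:]:
        if ch in ",;":
            if value == 0:
                break  # the null terminator ends the text
            chars.append(chr(value))
            value = 0
            if ch == ";":
                break
        elif "0" <= ch <= "9":
            value = value * 10 + (ord(ch) - ord("0"))
        else:
            break  # whitespace or anything unexpected stops the scan
    return "".join(chars)
```

<p class="inwebparagraph">As in the routine above, digits are accumulated into a running value which is flushed at each comma or semicolon.</p>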
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
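<p class="inwebparagraph"><a id="SP20"></a><b>§20. Substitution. </b>Judging from the code, the first routine copies <code class="display"><span class="extract">txt</span></code> into <code class="display"><span class="extract">to</span></code> and transmutes the copy, so that the result is held as an explicit array of characters; the second returns false only for a text which has no long block and whose second word is a routine, that is, one still waiting to be expanded.</p>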
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_SubstitutedForm to txt;</span>
|
||
|
<span class="plain">if (txt) {</span>
|
||
|
<span class="plain">BlkValueCopy(to, txt);</span>
|
||
|
<span class="plain">TEXT_TY_Transmute(to);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">return to;</span>
|
||
|
<span class="plain">];</span>
|
||
|
|
||
|
<span class="plain">[ TEXT_TY_IsSubstituted txt;</span>
|
||
|
<span class="plain">if ((txt) &&</span>
|
||
|
<span class="plain">(txt-->0 & BLK_BVBITMAP_LONGBLOCKMASK == 0) &&</span>
|
||
|
<span class="plain">(txt-->1 ofclass Routine)) rfalse;</span>
|
||
|
<span class="plain">rtrue;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP21"></a><b>§21. Perishability. </b>As noted above, a perishable constant is one which must be expanded before
|
||
|
the values it refers to vanish from existence.
|
||
|
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_ExpandIfPerishable to from;</span>
|
||
|
<span class="plain">if ((from) && (from-->0 == CONSTANT_PERISHABLE_TEXT_STORAGE))</span>
|
||
|
<span class="plain">return TEXT_TY_SubstitutedForm(to, from);</span>
|
||
|
<span class="plain">return from;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP22"></a><b>§22. Blobs. </b>That completes the compulsory services required for this KOV to function:
|
||
|
from here on, the remaining routines provide definitions of text-related
|
||
|
phrases in the Standard Rules.
|
||
|
</p>
|
||
|
|
||
|
<p class="inwebparagraph">What are the basic operations of text-handling? Clearly we want to be able
|
||
|
to search, and replace, but that is left for the segment "RegExp.i6t"
|
||
|
to handle. More basically we would like to be able to read and write
|
||
|
characters from the text. But texts in I7 tend to be of natural language,
|
||
|
rather than containing arbitrary material — that's indeed why we call them
|
||
|
texts rather than strings. This means they are likely to be punctuated
|
||
|
sequences of words, divided up perhaps into sentences and even paragraphs.
|
||
|
</p>
|
||
|
|
||
|
<p class="inwebparagraph">So we provide facilities which regard a text as being an array of "blobs",
|
||
|
where a "blob" is a unit of text. The user can choose whether to see it
|
||
|
as an array of characters, or words (of three different sorts: see the
|
||
|
Inform documentation for details), or paragraphs, or lines.
|
||
|
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">Constant CHR_BLOB = 1; ! Construe as an array of characters</span>
|
||
|
<span class="plain">Constant WORD_BLOB = 2; ! Of words</span>
|
||
|
<span class="plain">Constant PWORD_BLOB = 3; ! Of punctuated words</span>
|
||
|
<span class="plain">Constant UWORD_BLOB = 4; ! Of unpunctuated words</span>
|
||
|
<span class="plain">Constant PARA_BLOB = 5; ! Of paragraphs</span>
|
||
|
<span class="plain">Constant LINE_BLOB = 6; ! Of lines</span>
|
||
|
|
||
|
<span class="plain">Constant REGEXP_BLOB = 7; ! Not a blob type as such, but needed as a distinct value</span>
|
||
|
</pre>
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP23"></a><b>§23. Blob Access. </b>The following routine runs a small finite-state machine to count the number
|
||
|
of blobs in a text, using any of the above blob types (except
|
||
|
<code class="display"><span class="extract">REGEXP_BLOB</span></code>, which is used for other purposes). If the optional arguments
|
||
|
<code class="display"><span class="extract">ctxt</span></code> and <code class="display"><span class="extract">wanted</span></code> are supplied, it also copies the text of blob number
|
||
|
<code class="display"><span class="extract">wanted</span></code> (counting upwards from 1 at the start of the text) into the
|
||
|
text <code class="display"><span class="extract">ctxt</span></code>. If the further optional argument <code class="display"><span class="extract">rtxt</span></code> is supplied,
|
||
|
then <code class="display"><span class="extract">ctxt</span></code> is instead written with the original text <code class="display"><span class="extract">txt</span></code> as it would
|
||
|
read if the blob in question were replaced with the text in <code class="display"><span class="extract">rtxt</span></code>.
|
||
|
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">Constant WS_BRM = 1;</span>
|
||
|
<span class="plain">Constant SKIPPED_BRM = 2;</span>
|
||
|
<span class="plain">Constant ACCEPTED_BRM = 3;</span>
|
||
|
<span class="plain">Constant ACCEPTEDP_BRM = 4;</span>
|
||
|
<span class="plain">Constant ACCEPTEDN_BRM = 5;</span>
|
||
|
<span class="plain">Constant ACCEPTEDPN_BRM = 6;</span>
|
||
|
|
||
|
<span class="plain">[ TEXT_TY_BlobAccess txt blobtype ctxt wanted rtxt</span>
|
||
|
<span class="plain">p1 p2 cp1 cp2 r;</span>
|
||
|
<span class="plain">if (txt==0) return 0;</span>
|
||
|
<span class="plain">if (blobtype == CHR_BLOB) return TEXT_TY_CharacterLength(txt);</span>
|
||
|
<span class="plain">cp1 = txt-->0; p1 = TEXT_TY_Temporarily_Transmute(txt);</span>
|
||
|
<span class="plain">cp2 = rtxt-->0; p2 = TEXT_TY_Temporarily_Transmute(rtxt);</span>
|
||
|
<span class="plain">TEXT_TY_Transmute(ctxt);</span>
|
||
|
<span class="plain">r = TEXT_TY_BlobAccessI(txt, blobtype, ctxt, wanted, rtxt);</span>
|
||
|
<span class="plain">TEXT_TY_Untransmute(txt, p1, cp1);</span>
|
||
|
<span class="plain">TEXT_TY_Untransmute(rtxt, p2, cp2);</span>
|
||
|
<span class="plain">return r;</span>
|
||
|
<span class="plain">];</span>
|
||
|
<span class="plain">[ TEXT_TY_BlobAccessI txt blobtype ctxt wanted rtxt</span>
|
||
|
<span class="plain">brm oldbrm ch i dsize csize blobcount gp cl j;</span>
|
||
|
<span class="plain">dsize = BlkValueLBCapacity(txt);</span>
|
||
|
<span class="plain">if (ctxt) csize = BlkValueLBCapacity(ctxt);</span>
|
||
|
<span class="plain">else if (rtxt) "*** rtxt without ctxt ***";</span>
|
||
|
<span class="plain">brm = WS_BRM;</span>
|
||
|
<span class="plain">for (i=0:i<dsize:i++) {</span>
|
||
|
<span class="plain">ch = BlkValueRead(txt, i);</span>
|
||
|
<span class="plain">if (ch == 0) break;</span>
|
||
|
<span class="plain">oldbrm = brm;</span>
|
||
|
<span class="plain">if (ch == 10 or 13 or 32 or 9) {</span>
|
||
|
<span class="plain">if (oldbrm ~= WS_BRM) {</span>
|
||
|
<span class="plain">gp = 0;</span>
|
||
|
<span class="plain">for (j=i:j<dsize:j++) {</span>
|
||
|
<span class="plain">ch = BlkValueRead(txt, j);</span>
|
||
|
<span class="plain">if (ch == 0) { brm = WS_BRM; break; }</span>
|
||
|
<span class="plain">if (ch == 10 or 13) { gp++; continue; }</span>
|
||
|
<span class="plain">if (ch ~= 32 or 9) break;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">ch = BlkValueRead(txt, i);</span>
|
||
|
<span class="plain">if (j == dsize) brm = WS_BRM;</span>
|
||
|
<span class="plain">switch (blobtype) {</span>
|
||
|
<span class="plain">PARA_BLOB: if (gp >= 2) brm = WS_BRM;</span>
|
||
|
<span class="plain">LINE_BLOB: if (gp >= 1) brm = WS_BRM;</span>
|
||
|
<span class="plain">default: brm = WS_BRM;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">} else {</span>
|
||
|
<span class="plain">gp = false;</span>
|
||
|
<span class="plain">if ((blobtype == WORD_BLOB or PWORD_BLOB or UWORD_BLOB) &&</span>
|
||
|
<span class="plain">(ch == '.' or ',' or '!' or '?'</span>
|
||
|
<span class="plain">or '-' or '/' or '"' or ':' or ';'</span>
|
||
|
<span class="plain">or '(' or ')' or '[' or ']' or '{' or '}'))</span>
|
||
|
<span class="plain">gp = true;</span>
|
||
|
<span class="plain">switch (oldbrm) {</span>
|
||
|
<span class="plain">WS_BRM:</span>
|
||
|
<span class="plain">brm = ACCEPTED_BRM;</span>
|
||
|
<span class="plain">if (blobtype == WORD_BLOB) {</span>
|
||
|
<span class="plain">if (gp) brm = SKIPPED_BRM;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">if (blobtype == PWORD_BLOB) {</span>
|
||
|
<span class="plain">if (gp) brm = ACCEPTEDP_BRM;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">SKIPPED_BRM:</span>
|
||
|
<span class="plain">if (blobtype == WORD_BLOB) {</span>
|
||
|
<span class="plain">if (gp == false) brm = ACCEPTED_BRM;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">ACCEPTED_BRM:</span>
|
||
|
<span class="plain">if (blobtype == WORD_BLOB) {</span>
|
||
|
<span class="plain">if (gp) brm = SKIPPED_BRM;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">if (blobtype == PWORD_BLOB) {</span>
|
||
|
<span class="plain">if (gp) brm = ACCEPTEDP_BRM;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">ACCEPTEDP_BRM:</span>
|
||
|
<span class="plain">if (blobtype == PWORD_BLOB) {</span>
|
||
|
<span class="plain">if (gp == false) brm = ACCEPTED_BRM;</span>
|
||
|
<span class="plain">else {</span>
|
||
|
<span class="plain">if ((ch == BlkValueRead(txt, i-1)) &&</span>
|
||
|
<span class="plain">(ch == '-' or '.')) blobcount--;</span>
|
||
|
<span class="plain">blobcount++;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">ACCEPTEDN_BRM:</span>
|
||
|
<span class="plain">if (blobtype == WORD_BLOB) {</span>
|
||
|
<span class="plain">if (gp) brm = SKIPPED_BRM;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">if (blobtype == PWORD_BLOB) {</span>
|
||
|
<span class="plain">if (gp) brm = ACCEPTEDP_BRM;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">ACCEPTEDPN_BRM:</span>
|
||
|
<span class="plain">if (blobtype == PWORD_BLOB) {</span>
|
||
|
<span class="plain">if (gp == false) brm = ACCEPTED_BRM;</span>
|
||
|
<span class="plain">else {</span>
|
||
|
<span class="plain">if ((ch == BlkValueRead(txt, i-1)) &&</span>
|
||
|
<span class="plain">(ch == '-' or '.')) blobcount--;</span>
|
||
|
<span class="plain">blobcount++;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">if (brm == ACCEPTED_BRM or ACCEPTEDP_BRM) {</span>
|
||
|
<span class="plain">if (oldbrm ~= brm) blobcount++;</span>
|
||
|
<span class="plain">if ((ctxt) && (blobcount == wanted)) {</span>
|
||
|
<span class="plain">if (rtxt) {</span>
|
||
|
<span class="plain">BlkValueWrite(ctxt, cl, 0);</span>
|
||
|
<span class="plain">TEXT_TY_Concatenate(ctxt, rtxt, CHR_BLOB);</span>
|
||
|
<span class="plain">csize = BlkValueLBCapacity(ctxt);</span>
|
||
|
<span class="plain">cl = TEXT_TY_CharacterLength(ctxt);</span>
|
||
|
<span class="plain">if (brm == ACCEPTED_BRM) brm = ACCEPTEDN_BRM;</span>
|
||
|
<span class="plain">if (brm == ACCEPTEDP_BRM) brm = ACCEPTEDPN_BRM;</span>
|
||
|
<span class="plain">} else {</span>
|
||
|
<span class="plain">if (cl+1 >= csize) {</span>
|
||
|
<span class="plain">if (BlkValueSetLBCapacity(ctxt, 2*cl) == false) break;</span>
|
||
|
<span class="plain">csize = BlkValueLBCapacity(ctxt);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">BlkValueWrite(ctxt, cl++, ch);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">} else {</span>
|
||
|
<span class="plain">if (rtxt) {</span>
|
||
|
<span class="plain">if (cl+1 >= csize) {</span>
|
||
|
<span class="plain">if (BlkValueSetLBCapacity(ctxt, 2*cl) == false) break;</span>
|
||
|
<span class="plain">csize = BlkValueLBCapacity(ctxt);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">BlkValueWrite(ctxt, cl++, ch);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">} else {</span>
|
||
|
<span class="plain">if ((rtxt) && (brm ~= ACCEPTEDN_BRM or ACCEPTEDPN_BRM)) {</span>
|
||
|
<span class="plain">if (cl+1 >= csize) {</span>
|
||
|
<span class="plain">if (BlkValueSetLBCapacity(ctxt, 2*cl) == false) break;</span>
|
||
|
<span class="plain">csize = BlkValueLBCapacity(ctxt);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">BlkValueWrite(ctxt, cl++, ch);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">if (ctxt) BlkValueWrite(ctxt, cl++, 0);</span>
|
||
|
<span class="plain">return blobcount;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
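<p class="inwebparagraph">The counting half of the machine can be followed more easily in a smaller model. The Python sketch below is a hedged transcription of the counting logic only: it omits the <code class="display"><span class="extract">ctxt</span></code> and <code class="display"><span class="extract">rtxt</span></code> handling, the name <code class="display"><span class="extract">count_blobs</span></code> is hypothetical, and a Python string stands in for the block value. The constants reuse the numeric values defined above.</p>

```python
WORD_BLOB, PWORD_BLOB, UWORD_BLOB, PARA_BLOB, LINE_BLOB = 2, 3, 4, 5, 6

WHITESPACE = "\n\r \t"
PUNCTUATION = '.,!?-/":;()[]{}'

# States of the machine (the two "N" states only matter when replacing).
WS, SKIPPED, ACCEPTED, ACCEPTEDP = 1, 2, 3, 4


def count_blobs(text, blobtype):
    """Count blobs of the given type in text (counting logic only)."""
    state = WS
    count = 0
    n = len(text)
    for i, ch in enumerate(text):
        old = state
        if ch in WHITESPACE:
            if old != WS:
                # Look ahead over the whole whitespace run, counting line breaks.
                breaks = 0
                j = i
                while j < n and text[j] in WHITESPACE:
                    if text[j] in "\n\r":
                        breaks += 1
                    j += 1
                if j == n:
                    state = WS  # trailing whitespace always ends the blob
                elif blobtype == PARA_BLOB:
                    if breaks >= 2:
                        state = WS
                elif blobtype == LINE_BLOB:
                    if breaks >= 1:
                        state = WS
                else:
                    state = WS
        else:
            punct = blobtype in (WORD_BLOB, PWORD_BLOB) and ch in PUNCTUATION
            if old in (WS, ACCEPTED):
                state = ACCEPTED
                if punct:
                    state = SKIPPED if blobtype == WORD_BLOB else ACCEPTEDP
            elif old == SKIPPED:
                if not punct:
                    state = ACCEPTED
            elif old == ACCEPTEDP:
                if not punct:
                    state = ACCEPTED
                elif not (ch == text[i - 1] and ch in "-."):
                    count += 1  # each further punctuation mark is its own blob
        if state in (ACCEPTED, ACCEPTEDP) and state != old:
            count += 1
    return count
```

<p class="inwebparagraph">In this model "Hello, world!" contains 2 words, 4 punctuated words and 2 unpunctuated words. A run of identical hyphens or full stops counts as a single punctuated word, as in the routine above.</p>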
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP24"></a><b>§24. Get Blob. </b>The front end which uses the above routine to read a blob. (Note that, for
|
||
|
efficiency's sake, we read characters more directly.)
|
||
|
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_GetBlob ctxt txt wanted blobtype;</span>
|
||
|
<span class="plain">if (txt==0) return;</span>
|
||
|
<span class="plain">if (blobtype == CHR_BLOB) return TEXT_TY_GetCharacter(ctxt, txt, wanted);</span>
|
||
|
<span class="plain">TEXT_TY_BlobAccess(txt, blobtype, ctxt, wanted);</span>
|
||
|
<span class="plain">return ctxt;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
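<p class="inwebparagraph">For the simplest blob type, the effect of this front end can be imitated in a few lines. The Python sketch below is illustrative only: the name <code class="display"><span class="extract">get_unpunctuated_word</span></code> is hypothetical, it handles only words divided by whitespace, and it returns an empty string when the wanted blob does not exist.</p>

```python
import re


def get_unpunctuated_word(text, wanted):
    """Return whitespace-delimited word number `wanted`, counting from 1."""
    # Split on the same four whitespace characters the template recognises.
    words = [w for w in re.split(r"[ \t\r\n]+", text) if w]
    if 1 <= wanted <= len(words):
        return words[wanted - 1]
    return ""
```

<p class="inwebparagraph">Punctuation stays attached to the word in this model, so the second word of "Hello, world!" is "world!".</p>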
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP25"></a><b>§25. Replace Blob. </b>The front end which uses the above routine to replace a blob. (Once again,
|
||
|
characters are handled directly to avoid incurring all that overhead.)
|
||
|
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_ReplaceBlob blobtype txt wanted rtxt ctxt ilen rlen i p cp;</span>
|
||
|
<span class="plain">TEXT_TY_Transmute(txt);</span>
|
||
|
<span class="plain">cp = rtxt-->0; p = TEXT_TY_Temporarily_Transmute(rtxt);</span>
|
||
|
<span class="plain">if (blobtype == CHR_BLOB) {</span>
|
||
|
<span class="plain">ilen = TEXT_TY_CharacterLength(txt);</span>
|
||
|
<span class="plain">rlen = TEXT_TY_CharacterLength(rtxt);</span>
|
||
|
<span class="plain">wanted--;</span>
|
||
|
<span class="plain">if ((wanted >= 0) && (wanted<ilen)) {</span>
|
||
|
<span class="plain">if (rlen == 1) {</span>
|
||
|
<span class="plain">BlkValueWrite(txt, wanted, BlkValueRead(rtxt, 0));</span>
|
||
|
<span class="plain">} else {</span>
|
||
|
<span class="plain">ctxt = BlkValueCreate(TEXT_TY);</span>
|
||
|
<span class="plain">TEXT_TY_Transmute(ctxt);</span>
|
||
|
<span class="plain">if (BlkValueSetLBCapacity(ctxt, ilen+rlen+1)) {</span>
|
||
|
<span class="plain">for (i=0:i<wanted:i++)</span>
|
||
|
<span class="plain">BlkValueWrite(ctxt, i, BlkValueRead(txt, i));</span>
|
||
|
<span class="plain">for (i=0:i<rlen:i++)</span>
|
||
|
<span class="plain">BlkValueWrite(ctxt, wanted+i, BlkValueRead(rtxt, i));</span>
|
||
|
<span class="plain">for (i=wanted+1:i<ilen:i++)</span>
|
||
|
<span class="plain">BlkValueWrite(ctxt, rlen+i-1, BlkValueRead(txt, i));</span>
|
||
|
<span class="plain">BlkValueWrite(ctxt, rlen+ilen, 0);</span>
|
||
|
<span class="plain">BlkValueCopy(txt, ctxt);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">BlkValueFree(ctxt);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">} else {</span>
|
||
|
<span class="plain">ctxt = BlkValueCreate(TEXT_TY);</span>
|
||
|
<span class="plain">TEXT_TY_BlobAccess(txt, blobtype, ctxt, wanted, rtxt);</span>
|
||
|
<span class="plain">BlkValueCopy(txt, ctxt);</span>
|
||
|
<span class="plain">BlkValueFree(ctxt);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">TEXT_TY_Untransmute(rtxt, p, cp);</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
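<p class="inwebparagraph">The character case above amounts to splicing the replacement in at a 1-based position. A minimal Python sketch, with the hypothetical name <code class="display"><span class="extract">replace_character</span></code> and Python strings standing in for block values (so a new string is returned rather than the original being altered):</p>

```python
def replace_character(text, wanted, replacement):
    """Replace character number `wanted` (counting from 1) with `replacement`."""
    index = wanted - 1  # the underlying array counts from 0
    if 0 <= index < len(text):
        return text[:index] + replacement + text[index + 1:]
    return text  # out-of-range positions leave the text unchanged
```

<p class="inwebparagraph">The replacement may be of any length: replacing character 2 of "cat" with "oa" gives "coat".</p>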
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP26"></a><b>§26. Replace Text. </b>This is the general routine which searches for any instance of <code class="display"><span class="extract">ftxt</span></code>,
|
||
|
as a blob, in <code class="display"><span class="extract">txt</span></code>, and replaces it with the text <code class="display"><span class="extract">rtxt</span></code>. It works on
|
||
|
any of the above blob-types, but two cases are special: first, if the
|
||
|
blob-type is <code class="display"><span class="extract">CHR_BLOB</span></code>, then it can do more than search and replace
|
||
|
for any instance of a single character: it can search and replace any
|
||
|
instance of a substring, so that <code class="display"><span class="extract">ftxt</span></code> is not required to be only a
|
||
|
single character. Second, if the blob-type is the special value
|
||
|
<code class="display"><span class="extract">REGEXP_BLOB</span></code> then <code class="display"><span class="extract">ftxt</span></code> is interpreted as a regular expression rather
|
||
|
than something literal to find: see "RegExp.i6t" for what happens next.
|
||
|
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_ReplaceText blobtype txt ftxt rtxt</span>
|
||
|
<span class="plain">r p1 p2 cp1 cp2;</span>
|
||
|
<span class="plain">TEXT_TY_Transmute(txt);</span>
|
||
|
<span class="plain">cp1 = ftxt-->0; p1 = TEXT_TY_Temporarily_Transmute(ftxt);</span>
|
||
|
<span class="plain">cp2 = rtxt-->0; p2 = TEXT_TY_Temporarily_Transmute(rtxt);</span>
|
||
|
<span class="plain">r = TEXT_TY_ReplaceTextI(blobtype, txt, ftxt, rtxt);</span>
|
||
|
<span class="plain">TEXT_TY_Untransmute(ftxt, p1, cp1);</span>
|
||
|
<span class="plain">TEXT_TY_Untransmute(rtxt, p2, cp2);</span>
|
||
|
<span class="plain">return r;</span>
|
||
|
<span class="plain">];</span>
|
||
|
|
||
|
<span class="plain">[ TEXT_TY_ReplaceTextI blobtype txt ftxt rtxt</span>
|
||
|
<span class="plain">ctxt csize ilen flen i cl mpos ch chm whitespace punctuation;</span>
|
||
|
<span class="plain">if (blobtype == REGEXP_BLOB or CHR_BLOB)</span>
|
||
|
<span class="plain">return TEXT_TY_Replace_RE(blobtype, txt, ftxt, rtxt);</span>
|
||
|
|
||
|
<span class="plain">ilen = TEXT_TY_CharacterLength(txt);</span>
|
||
|
<span class="plain">flen = TEXT_TY_CharacterLength(ftxt);</span>
|
||
|
<span class="plain">ctxt = BlkValueCreate(TEXT_TY);</span>
|
||
|
<span class="plain">TEXT_TY_Transmute(ctxt);</span>
|
||
|
<span class="plain">csize = BlkValueLBCapacity(ctxt);</span>
|
||
|
<span class="plain">mpos = 0;</span>
|
||
|
|
||
|
<span class="plain">whitespace = true; punctuation = false;</span>
|
||
|
<span class="plain">for (i=0:i<=ilen:i++) {</span>
|
||
|
<span class="plain">ch = BlkValueRead(txt, i);</span>
|
||
|
<span class="plain">.MoreMatching;</span>
|
||
|
<span class="plain">chm = BlkValueRead(ftxt, mpos++);</span>
|
||
|
<span class="plain">if (mpos == 1) {</span>
|
||
|
<span class="plain">switch (blobtype) {</span>
|
||
|
<span class="plain">WORD_BLOB:</span>
|
||
|
<span class="plain">if ((whitespace == false) && (punctuation == false)) chm = -1;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">whitespace = false;</span>
|
||
|
<span class="plain">if (ch == 10 or 13 or 32 or 9) whitespace = true;</span>
|
||
|
<span class="plain">punctuation = false;</span>
|
||
|
<span class="plain">if (ch == '.' or ',' or '!' or '?'</span>
|
||
|
<span class="plain">or '-' or '/' or '"' or ':' or ';'</span>
|
||
|
<span class="plain">or '(' or ')' or '[' or ']' or '{' or '}') {</span>
|
||
|
<span class="plain">if (blobtype == WORD_BLOB) chm = -1;</span>
|
||
|
<span class="plain">punctuation = true;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">if (ch == chm) {</span>
|
||
|
<span class="plain">if (mpos == flen) {</span>
|
||
|
<span class="plain">if (i == ilen) chm = 0;</span>
|
||
|
<span class="plain">else chm = BlkValueRead(txt, i+1);</span>
|
||
|
<span class="plain">if ((blobtype == CHR_BLOB) ||</span>
|
||
|
<span class="plain">(chm == 0 or 10 or 13 or 32 or 9) ||</span>
|
||
|
<span class="plain">(chm == '.' or ',' or '!' or '?'</span>
|
||
|
<span class="plain">or '-' or '/' or '"' or ':' or ';'</span>
|
||
|
<span class="plain">or '(' or ')' or '[' or ']' or '{' or '}')) {</span>
|
||
|
<span class="plain">mpos = 0;</span>
|
||
|
<span class="plain">cl = cl - (flen-1);</span>
|
||
|
<span class="plain">BlkValueWrite(ctxt, cl, 0);</span>
|
||
|
<span class="plain">TEXT_TY_Concatenate(ctxt, rtxt, CHR_BLOB);</span>
|
||
|
<span class="plain">csize = BlkValueLBCapacity(ctxt);</span>
|
||
|
<span class="plain">cl = TEXT_TY_CharacterLength(ctxt);</span>
|
||
|
<span class="plain">continue;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">} else {</span>
|
||
|
<span class="plain">mpos = 0;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">if (cl+1 >= csize) {</span>
|
||
|
<span class="plain">if (BlkValueSetLBCapacity(ctxt, 2*cl) == false) break;</span>
|
||
|
<span class="plain">csize = BlkValueLBCapacity(ctxt);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">BlkValueWrite(ctxt, cl++, ch);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">BlkValueCopy(txt, ctxt);</span>
|
||
|
<span class="plain">BlkValueFree(ctxt);</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
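<p class="inwebparagraph">The idea of matching only at word boundaries can be shown with a simplified model. The Python sketch below is not a port of the routine above: the name <code class="display"><span class="extract">replace_word</span></code> is hypothetical, it covers only the word case, and it simply requires a boundary character (or the edge of the text) on both sides of each match.</p>

```python
BOUNDARY = set(' \t\n\r.,!?-/":;()[]{}')


def replace_word(text, find, replacement):
    """Replace whole-word occurrences of `find` in `text` (simplified model)."""
    if not find:
        return text
    out = []
    i = 0
    n, m = len(text), len(find)
    while i < n:
        starts_ok = i == 0 or text[i - 1] in BOUNDARY
        ends_ok = i + m >= n or text[i + m] in BOUNDARY
        if starts_ok and ends_ok and text.startswith(find, i):
            out.append(replacement)
            i += m  # skip past the matched word
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

<p class="inwebparagraph">Here replacing "the" with "a" in "the cat, then the end" changes both occurrences of the word but leaves "then" alone.</p>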
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP27"></a><b>§27. Character Length. </b>When accessing at the character-by-character level, things are much easier
|
||
|
and we needn't go through any finite state machine palaver.
|
||
|
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_CharacterLength txt ch i dsize p cp r;</span>
|
||
|
<span class="plain">if (txt==0) return 0;</span>
|
||
|
<span class="plain">cp = txt-->0; p = TEXT_TY_Temporarily_Transmute(txt);</span>
|
||
|
<span class="plain">dsize = BlkValueLBCapacity(txt); r = dsize;</span>
|
||
|
<span class="plain">for (i=0:i<dsize:i++) {</span>
|
||
|
<span class="plain">ch = BlkValueRead(txt, i);</span>
|
||
|
<span class="plain">if (ch == 0) { r = i; break; }</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">TEXT_TY_Untransmute(txt, p, cp);</span>
|
||
|
<span class="plain">return r;</span>
|
||
|
<span class="plain">];</span>
|
||
|
|
||
|
<span class="plain">[ TEXT_TY_Empty txt;</span>
|
||
|
<span class="plain">if (txt==0) rtrue;</span>
|
||
|
<span class="plain">if (txt-->0 & BLK_BVBITMAP_LONGBLOCKMASK == 0) {</span>
|
||
|
<span class="plain">if (txt-->1 == EMPTY_TEXT_PACKED) rtrue;</span>
|
||
|
<span class="plain">rfalse;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">if (TEXT_TY_CharacterLength(txt) == 0) rtrue;</span>
|
||
|
<span class="plain">rfalse;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
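<p class="inwebparagraph">The length calculation is a scan for the null terminator, with the capacity of the array as a fallback. A minimal Python sketch, in which a list of character codes stands in for the long block and the name <code class="display"><span class="extract">character_length</span></code> is hypothetical:</p>

```python
def character_length(buffer):
    """Number of character codes before the first 0, or the full capacity."""
    for i, code in enumerate(buffer):
        if code == 0:
            return i
    return len(buffer)  # no terminator found: the whole capacity is in use
```

<p class="inwebparagraph">Anything stored after the first zero is ignored, so spare capacity does not affect the result.</p>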
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP28"></a><b>§28. Get Character. </b>Characters in a text are numbered upwards from 1 by the users of this
|
||
|
routine: which is why we subtract 1 when reading the array in the
|
||
|
block-value, which counts from 0.
|
||
|
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_GetCharacter ctxt txt i ch p cp;</span>
|
||
|
<span class="plain">if (txt==0) return 0;</span>
|
||
|
<span class="plain">cp = txt-->0; p = TEXT_TY_Temporarily_Transmute(txt);</span>
|
||
|
<span class="plain">TEXT_TY_Transmute(ctxt);</span>
|
||
|
<span class="plain">if ((i<=0) || (i>TEXT_TY_CharacterLength(txt))) ch = 0;</span>
|
||
|
<span class="plain">else ch = BlkValueRead(txt, i-1);</span>
|
||
|
<span class="plain">BlkValueWrite(ctxt, 0, ch);</span>
|
||
|
<span class="plain">BlkValueWrite(ctxt, 1, 0);</span>
|
||
|
<span class="plain">TEXT_TY_Untransmute(txt, p, cp);</span>
|
||
|
<span class="plain">return ctxt;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
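<p class="inwebparagraph">The index adjustment is easy to get wrong, so a small model may help. In this Python sketch the name <code class="display"><span class="extract">get_character</span></code> is hypothetical and a Python string stands in for the text; an out-of-range position yields the empty text.</p>

```python
def get_character(text, i):
    """Return character number i of text, counting from 1, as a text."""
    if i <= 0 or i > len(text):
        return ""  # out of range: the result is the empty text
    return text[i - 1]  # users count from 1, the array from 0
```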
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP29"></a><b>§29. Casing. </b>In many programming languages, characters are a distinct data type from
|
||
|
strings, but not in I7. To I7, a character is simply a text which
|
||
|
happens to have length 1 — this has its inefficiencies, but is conceptually
|
||
|
easy for the user.
|
||
|
</p>
|
||
|
|
||
|
<p class="inwebparagraph"><code class="display"><span class="extract">TEXT_TY_CharactersOfCase(txt, case)</span></code> determines whether all the characters in <code class="display"><span class="extract">txt</span></code>
|
||
|
are letters of the given casing: 0 for lower case, 1 for upper case. In the
|
||
|
case of ZSCII, this is done correctly handling all of the European accented
|
||
|
letters; in the case of Unicode, it follows the Unicode standard.
|
||
|
</p>
|
||
|
|
||
|
<p class="inwebparagraph">Note that there is no requirement for <code class="display"><span class="extract">txt</span></code> to be only a single character
|
||
|
long.
|
||
|
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_CharactersOfCase txt case i ch len p cp r;</span>
|
||
|
<span class="plain">if (txt==0) return 0;</span>
|
||
|
<span class="plain">cp = txt-->0; p = TEXT_TY_Temporarily_Transmute(txt);</span>
|
||
|
<span class="plain">len = TEXT_TY_CharacterLength(txt);</span>
|
||
|
<span class="plain">r = true;</span>
|
||
|
<span class="plain">for (i=0:i<len:i++) {</span>
|
||
|
<span class="plain">ch = BlkValueRead(txt, i);</span>
|
||
|
<span class="plain">if ((ch) && (CharIsOfCase(ch, case) == false)) { r = false; break; }</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">TEXT_TY_Untransmute(txt, p, cp);</span>
|
||
|
<span class="plain">return r;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
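<p class="inwebparagraph">The test can be modelled as below. This Python sketch is an illustration under assumptions: the name <code class="display"><span class="extract">characters_of_case</span></code> is hypothetical, and Python's own Unicode case tests stand in for <code class="display"><span class="extract">CharIsOfCase</span></code>, which is assumed here to fail for any character that is not a letter.</p>

```python
def characters_of_case(text, case):
    """True if every character of text is a letter of the given case.

    case is 0 for lower case and 1 for upper case.
    """
    check = str.isupper if case == 1 else str.islower
    # An empty text passes, since no character fails the test.
    return all(check(ch) for ch in text)
```

<p class="inwebparagraph">As in the routine above, an empty text passes the test, because the loop never finds a character of the wrong case.</p>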
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP30"></a><b>§30. Change Case. </b>We set <code class="display"><span class="extract">ctxt</span></code> to the text in <code class="display"><span class="extract">txt</span></code>, except that all the letters are
|
||
|
converted to the <code class="display"><span class="extract">case</span></code> given (0 for lower, 1 for upper; the code below also accepts 2 for title case and 3 for sentence case). The definition
|
||
|
of what is a "letter", what case it has and what the other-case form is
|
||
|
are as specified in the ZSCII and Unicode standards.
|
||
|
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_CharactersToCase ctxt txt case i ch len bnd pk cp;</span>
|
||
|
<span class="plain">if (txt==0) return 0;</span>
|
||
|
<span class="plain">cp = txt-->0; pk = TEXT_TY_Temporarily_Transmute(txt);</span>
|
||
|
<span class="plain">TEXT_TY_Transmute(ctxt);</span>
|
||
|
<span class="plain">len = TEXT_TY_CharacterLength(txt);</span>
|
||
|
<span class="plain">if (BlkValueSetLBCapacity(ctxt, len+1)) {</span>
|
||
|
<span class="plain">bnd = 1;</span>
|
||
|
<span class="plain">for (i=0:i<len:i++) {</span>
|
||
|
<span class="plain">ch = BlkValueRead(txt, i);</span>
|
||
|
<span class="plain">if (case < 2) {</span>
|
||
|
<span class="plain">BlkValueWrite(ctxt, i, CharToCase(ch, case));</span>
|
||
|
<span class="plain">} else {</span>
|
||
|
<span class="plain">BlkValueWrite(ctxt, i, CharToCase(ch, bnd));</span>
|
||
|
<span class="plain">if (case == 2) {</span>
|
||
|
<span class="plain">bnd = 0;</span>
|
||
|
<span class="plain">if (ch == 0 or 10 or 13 or 32 or 9</span>
|
||
|
<span class="plain">or '.' or ',' or '!' or '?'</span>
|
||
|
<span class="plain">or '-' or '/' or '"' or ':' or ';'</span>
|
||
|
<span class="plain">or '(' or ')' or '[' or ']' or '{' or '}') bnd = 1;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">if (case == 3) {</span>
|
||
|
<span class="plain">if (ch ~= 0 or 10 or 13 or 32 or 9) {</span>
|
||
|
<span class="plain">if (bnd == 1) bnd = 0;</span>
|
||
|
<span class="plain">else {</span>
|
||
|
<span class="plain">if (ch == '.' or '!' or '?') bnd = 1;</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">BlkValueWrite(ctxt, len, 0);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">TEXT_TY_Untransmute(txt, pk, cp);</span>
|
||
|
<span class="plain">return ctxt;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
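<p class="inwebparagraph">The boundary flag used for title and sentence casing can be traced in a smaller model. The Python sketch below follows the same logic but is only an illustration: the names <code class="display"><span class="extract">characters_to_case</span></code> and <code class="display"><span class="extract">to_case</span></code> are hypothetical, and Python's case conversion stands in for <code class="display"><span class="extract">CharToCase</span></code> (it may differ for some characters, for example those whose upper-case form is more than one character long).</p>

```python
WHITESPACE = "\n\r \t"
PUNCTUATION = '.,!?-/":;()[]{}'


def to_case(ch, upper):
    """Stand-in for CharToCase: upper is 1 for upper case, 0 for lower."""
    return ch.upper() if upper else ch.lower()


def characters_to_case(text, case):
    """case: 0 lower, 1 upper, 2 title case, 3 sentence case."""
    out = []
    boundary = 1  # 1 means the next letter should be capitalised
    for ch in text:
        if case < 2:
            out.append(to_case(ch, case))
            continue
        out.append(to_case(ch, boundary))
        if case == 2:
            # Any whitespace or punctuation starts a new word.
            boundary = 1 if (ch in WHITESPACE or ch in PUNCTUATION) else 0
        elif case == 3 and ch not in WHITESPACE:
            if boundary == 1:
                boundary = 0  # the capital has now been used up
            elif ch in ".!?":
                boundary = 1  # a sentence has just ended
    return "".join(out)
```

<p class="inwebparagraph">With case 2, "the quick-brown fox" becomes "The Quick-Brown Fox"; with case 3, "hello there. HOW are you? fine" becomes "Hello there. How are you? Fine".</p>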
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<p class="inwebparagraph"><a id="SP31"></a><b>§31. Concatenation. </b>To concatenate two texts is to place one after the other: thus "green"
|
||
|
concatenated with "horn" makes "greenhorn". In this routine, <code class="display"><span class="extract">from_txt</span></code>
|
||
|
would be "horn", and is added at the end of <code class="display"><span class="extract">to_txt</span></code>, which is returned in
|
||
|
its expanded state.
|
||
|
</p>
|
||
|
|
||
|
<p class="inwebparagraph">When the blob type is <code class="display"><span class="extract">REGEXP_BLOB</span></code>, the routine is used not for simple
|
||
|
concatenation but to handle the concatenations occurring when a regular
|
||
|
expression search-and-replace is going on: see "RegExp.i6t".
|
||
|
</p>
|
||
|
|
||
|
|
||
|
<pre class="display">
|
||
|
<span class="plain">[ TEXT_TY_Concatenate to_txt from_txt blobtype ref_txt</span>
|
||
|
<span class="plain">p cp r;</span>
|
||
|
<span class="plain">if (to_txt==0) rfalse;</span>
|
||
|
<span class="plain">if (from_txt==0) return to_txt;</span>
|
||
|
<span class="plain">TEXT_TY_Transmute(to_txt);</span>
|
||
|
<span class="plain">cp = from_txt-->0; p = TEXT_TY_Temporarily_Transmute(from_txt);</span>
|
||
|
<span class="plain">r = TEXT_TY_ConcatenateI(to_txt, from_txt, blobtype, ref_txt);</span>
|
||
|
<span class="plain">TEXT_TY_Untransmute(from_txt, p, cp);</span>
|
||
|
<span class="plain">return r;</span>
|
||
|
<span class="plain">];</span>
|
||
|
|
||
|
<span class="plain">[ TEXT_TY_ConcatenateI to_txt from_txt blobtype ref_txt</span>
|
||
|
<span class="plain">pos len ch i tosize x y case;</span>
|
||
|
<span class="plain">switch(blobtype) {</span>
|
||
|
<span class="plain">CHR_BLOB, 0:</span>
|
||
|
<span class="plain">pos = TEXT_TY_CharacterLength(to_txt);</span>
|
||
|
<span class="plain">len = TEXT_TY_CharacterLength(from_txt);</span>
|
||
|
<span class="plain">if (BlkValueSetLBCapacity(to_txt, pos+len+1) == false) return to_txt;</span>
|
||
|
<span class="plain">for (i=0:i<len:i++) {</span>
|
||
|
<span class="plain">ch = BlkValueRead(from_txt, i);</span>
|
||
|
<span class="plain">BlkValueWrite(to_txt, i+pos, ch);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">BlkValueWrite(to_txt, len+pos, 0);</span>
|
||
|
<span class="plain">return to_txt;</span>
|
||
|
<span class="plain">REGEXP_BLOB:</span>
|
||
|
<span class="plain">return TEXT_TY_RE_Concatenate(to_txt, from_txt, blobtype, ref_txt);</span>
|
||
|
<span class="plain">}</span>
|
||
|
<span class="plain">print "*** TEXT_TY_Concatenate used on impossible blob type ***^";</span>
|
||
|
<span class="plain">rfalse;</span>
|
||
|
<span class="plain">];</span>
|
||
|
</pre>
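<p class="inwebparagraph">The simple case copies the characters of one null-terminated array onto the end of another. A minimal Python sketch, in which lists of character codes stand in for the long blocks and the names <code class="display"><span class="extract">concatenate</span></code> and <code class="display"><span class="extract">character_length</span></code> are hypothetical; unlike the routine above, it returns a new list rather than extending the first text in place.</p>

```python
def character_length(buffer):
    """Number of character codes before the first 0, or the full capacity."""
    for i, code in enumerate(buffer):
        if code == 0:
            return i
    return len(buffer)


def concatenate(to_buffer, from_buffer):
    """Return to_buffer's text followed by from_buffer's text and a terminator."""
    pos = character_length(to_buffer)
    length = character_length(from_buffer)
    # Anything after the first 0 in either buffer is spare capacity and is dropped.
    return to_buffer[:pos] + from_buffer[:length] + [0]
```

<p class="inwebparagraph">Applied to the codes for "green" and "horn", this gives the codes for "greenhorn" followed by a single zero.</p>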
|
||
|
|
||
|
<p class="inwebparagraph"></p>
|
||
|
|
||
|
<hr class="tocbar">
|
||
|
<ul class="toc"><li><a href="S-tt.html">Back to 'Tables Template'</a></li><li><a href="S-ut.html">Continue with 'UnicodeData Template'</a></li></ul><hr class="tocbar">
|
||
|
<!--End of weave-->
|
||
|
</body>
|
||
|
</html>
|
||
|
|