words

A library of code for reading natural language into a stream of words. This is version 1.

Chapter 1: Setting Up

Building on the foundation module.
- Words Module - Setting up the use of this module.

Chapter 2: Words in Isolation

Recognising different words, and storing phrases made of them.
- Vocabulary - To classify the words in the lexical stream, where two different words are considered equivalent if they are unquoted and have the same text, taken case insensitively.
- Word Assemblages - To manage arbitrary assemblies of vocabulary, if a little slowly.
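
The equivalence rule stated for Vocabulary can be illustrated with a minimal sketch. This is not the module's own code: the function names `is_quoted` and `equivalent` are hypothetical, and the sketch assumes that a quoted word is one beginning with a double-quote character.

```python
def is_quoted(word: str) -> bool:
    """Assumption: a quoted word is one that begins with a double quote."""
    return word.startswith('"')


def equivalent(a: str, b: str) -> bool:
    """Two words are equivalent only if both are unquoted and their
    text matches when compared case-insensitively."""
    if is_quoted(a) or is_quoted(b):
        # Quoted text never takes part in the equivalence.
        return False
    return a.casefold() == b.casefold()
```

Under these assumptions `equivalent("Lamp", "lamp")` holds, while any comparison involving a quoted word such as `"Lamp"` (with its quotation marks) does not.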

Chapter 3: Words in Sequence

Reading in arbitrary text and breaking it into a numbered sequence of words.
- Lexer - To break down a stream of characters into a numbered sequence of words, literal strings and literal I6 inclusions, removing comments and unnecessary whitespace.
- Wordings - To manage contiguous word ranges.
- Text From Files - This is where source text is read in, whether from extension files or from the main source text file, and fed into the lexer.
- Feeds - Feeds are conduits for arbitrary text to flow into the lexer, and to be converted into wordings.
- Identifiers - To represent snippets of natural language in identifier form, which a typical C-like compiler would accept.
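
As a rough illustration of what the Lexer and Wordings sections describe, the sketch below breaks text into a numbered sequence of words and then picks out a contiguous range. It is only a sketch under stated assumptions, not the module's implementation: it assumes comments are written in square brackets and are not nested, that literal strings are double-quoted, and that I6 inclusions are delimited by `(-` and `-)`. The names `lex` and `wording_text` are hypothetical, and unlike a real lexer this one does not split punctuation from words.

```python
import re

# Assumed token shapes (see the lead-in): comments, I6 inclusions,
# double-quoted literal strings, and ordinary words.
TOKEN = re.compile(
    r'(?P<comment>\[[^\]]*\])'
    r'|(?P<i6>\(-.*?-\))'
    r'|(?P<string>"[^"]*")'
    r'|(?P<word>[^\s"\[]+)',
    re.DOTALL,
)


def lex(text: str) -> list[str]:
    """Return the words of the text in order; a word's number is its
    index in the returned list. Comments and whitespace are dropped."""
    words = []
    for match in TOKEN.finditer(text):
        if match.lastgroup != "comment":
            words.append(match.group())
    return words


def wording_text(words: list[str], first: int, last: int) -> str:
    """Join the contiguous range of words numbered first..last inclusive."""
    return " ".join(words[first:last + 1])
```

For instance, lexing `The lamp [a comment] is "shiny" here` yields five words, and the range from word 1 to word 3 reads `lamp is "shiny"`. Representing a wording as a pair of word numbers keeps it cheap to pass around, since no text is copied until it is needed.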

Chapter 4: Parsing

- Numbered Words - Some utilities for handling single words referred to by number.
- Preform - To read in structural definitions of natural language written in a meta-language called Preform.
- Basic Nonterminals - A handful of bare minimum Preform syntax.