mirror of
https://github.com/ganelson/inform.git
synced 2024-07-07 17:44:22 +03:00
177 lines
8.1 KiB
OpenEdge ABL
About Sentence Diagrams.

Description and examples of the diagrams which this module turns sentences into.

@ First, an acknowledgement: the sentence diagrams in this section are generated
automatically by //linguistics-test//. (This means they are always up to date.)
If you are interested in using //linguistics// in some context other than Inform,
//linguistics-test// may be a good starting point.

@ Every example sentence in this section was passed in turn to the <sentence>
nonterminal, and the trees displayed below were the result. For example:

= (undisplayed text from Figures/simple-raw.txt)

Sentence (1) here made no sense: there was no verb. It was therefore left as
a single |SENTENCE_NT| node with no children. In all other cases, as in (2),
there are three children: verb, subject phrase, and object phrase.[1]

In this tree notation, indentation shows which nodes are children of which
others. The node types, such as |SENTENCE_NT|, are in capitals and all end
in |_NT|. The text leading to the creation of the node then appears in quotes.
After that are "annotations", written in braces.[2] In sentence (2), we see:

(a) The |VERB_NT| node is annotated with its grammatical form -- it is "to be",
in third person singular, active voice, present tense, and a negative sense --
and also its semantic meaning -- the equality relationship "is".
(b) The second |UNPARSED_NOUN_NT| node is annotated with the article used to
introduce it -- the indefinite article, "a", which could be any of masculine,
feminine or neuter, could be either nominative or accusative, but is
certainly singular.

[1] Since "to be" is a copular verb, in sentence (2) we really mean "the
phrase in the object position".

[2] Since the 1850s a variety of tree-diagram schemes for sentence structure
has been proposed: see //Wikipedia -> https://en.wikipedia.org/wiki/Sentence_diagram//.
These tend to be quite large, with many optional features -- no bad thing when
the aim is to explain. But our aim is to process, not to illustrate, and
whereas a typical dependency tree would have nodes for both "not" and "a",
we use annotations instead. We want fairly flat sentence trees with a simple,
predictable shape.

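As a rough illustration of the notation just described, the following Python
sketch models a node as a type, the wording which produced it, a set of
annotations and a list of children, and prints it with children indented
beneath their parent. This is not how //linguistics// is implemented: the
|Node| class, the |render| function, the annotation keys and the exact output
layout are all assumptions made for the example, as are the two sentences used.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    node_type: str                                   # e.g. "SENTENCE_NT"
    text: str                                        # wording which led to the node
    annotations: dict = field(default_factory=dict)  # shown in braces
    children: list = field(default_factory=list)     # shown by indentation

def render(node, depth=0):
    """Return one line per node, indenting each child below its parent."""
    line = "    " * depth + node.node_type + " '" + node.text + "'"
    if node.annotations:
        notes = ", ".join(f"{key}: {value}" for key, value in node.annotations.items())
        line += " {" + notes + "}"
    lines = [line]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines

# A sentence with no verb stays a single childless node (hypothetical wording).
verbless = Node("SENTENCE_NT", "Beth a sailor")

# Otherwise there are three children: verb, subject phrase, object phrase.
# "not" and "a" become annotations rather than nodes of their own.
sentence = Node("SENTENCE_NT", "Beth is not a sailor", children=[
    Node("VERB_NT", "is not", {"verb": "be", "sense": "negative"}),
    Node("UNPARSED_NOUN_NT", "Beth"),
    Node("UNPARSED_NOUN_NT", "sailor", {"article": "a", "number": "singular"}),
])

print("\n".join(render(verbless) + render(sentence)))
```

Keeping "not" and "a" as annotations is what holds the tree to one level of
children below the sentence node, however the verb and nouns are qualified.
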
@ Using <sentence> alone tends to result in a lot of |UNPARSED_NOUN_NT| nodes.
This is unsatisfying, but useful, because sometimes the meaning of a verb
affects how those nodes should be parsed further. The idea is that the user
will traverse the tree and parse the |UNPARSED_NOUN_NT| nodes as needed.
Calling the function //Nouns::recognise// on such a node will test to see
if it's a known common or proper noun, and amend it accordingly.

The //linguistics-test// program does this automatically, so from here on,
all examples shown will have that operation done. For example:

= (undisplayed text from Figures/simple.txt)

Here the two |UNPARSED_NOUN_NT| nodes have been recognised as usages of a
proper noun, Beth, and a common noun, sailor, respectively, and they are
annotated with their grammatical usages -- in so far as we can tell. These
two nouns do not inflect with case in English, but they are both singular.

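The traverse-and-recognise step described above might be sketched in Python as
follows. This is only a toy model, not the behaviour of //Nouns::recognise//
itself: the |VOCABULARY| table, the node type names |PROPER_NOUN_NT| and
|COMMON_NOUN_NT|, and the annotation keys are assumptions made for the example,
and the sketch treats every noun as singular.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    node_type: str
    text: str
    annotations: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

# Hypothetical vocabulary: wording -> (node type to amend to, grammatical gender).
VOCABULARY = {
    "beth": ("PROPER_NOUN_NT", "feminine"),
    "sailor": ("COMMON_NOUN_NT", "masculine"),
}

def recognise(node):
    """Amend an unparsed noun node in place if its wording is a known noun."""
    entry = VOCABULARY.get(node.text.lower())
    if node.node_type == "UNPARSED_NOUN_NT" and entry is not None:
        node.node_type, gender = entry
        # Toy simplification: no plural forms, so the number is always singular.
        node.annotations.update({"gender": gender, "number": "singular"})

def recognise_all(node):
    """Walk the whole tree, recognising nouns wherever possible."""
    recognise(node)
    for child in node.children:
        recognise_all(child)

tree = Node("SENTENCE_NT", "Beth is not a sailor", children=[
    Node("VERB_NT", "is not"),
    Node("UNPARSED_NOUN_NT", "Beth"),
    Node("UNPARSED_NOUN_NT", "sailor"),
])
unknown = Node("UNPARSED_NOUN_NT", "Zanzibar")

recognise_all(tree)
recognise(unknown)
print([child.node_type for child in tree.children], unknown.node_type)
```

A noun missing from the vocabulary is simply left as |UNPARSED_NOUN_NT|, so
that the caller can still parse it some other way later.
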
@ Clearly the //linguistics// module needs to know some vocabulary in order
to do this, and in the test runs displayed in this section, it is using a
very limited stock of nouns, verbs and prepositions as follows:

= (undisplayed text from Figures/vocabulary.txt)

We only know that Beth is feminine-gendered and sailor masculine-gendered[1]
because the vocabulary being used by //linguistics-test// says so. It's
important to appreciate that although an English reader might twig that
Beth is a common girl's name, we can't do that.

[1] In the grammatical sense that "she" can refer to Beth and "he" to a
generic identity-unknown sailor. Pronouns in English are a source of real
sensitivity and if //linguistics// were a module to generate text, rather
than recognise it, we would take much more care over this. Our interest
is in grammatical gender, not the assignment of sexes to people.

@ So, then, let us start with simple copular sentences -- that is,
sentences involving the verb "to be", which equate two subjects rather
than having a subject act upon an object. This is why one "ought to" say
"The traitor is I" instead of "The traitor is me", although nobody does.

= (undisplayed text from Figures/copular.txt)

@ Next, regular sentences, that is, those where the verb is not copular
but instead expresses some relationship between a subject and an object
which play different roles.

= (undisplayed text from Figures/regular.txt)

Each |RELATIONSHIP_NT| node expresses that it, and the other term, are
in some non-copular relation to each other. The annotation gives that
relation from the point of view of the node, not from the point of view
of the subject of the sentence. For example, in (4), the subject of the
sentence (woman) is carried by the object (table), but the |RELATIONSHIP_NT|
node is for the table, and so the meaning is "carries", not "carried-by".

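To make the point-of-view convention concrete, here is a small Python sketch
which reads the same relationship both ways round. The |CONVERSE| table and
the |facts| function are invented for this illustration; only the relation
names "carries" and "carried-by" come from the text above.

```python
# Hypothetical table pairing each relation with its converse.
CONVERSE = {"carries": "carried-by", "carried-by": "carries"}

def facts(subject, relation, term):
    """Read a RELATIONSHIP_NT annotation from both points of view.

    `relation` is stated from the point of view of `term`, the noun under
    the RELATIONSHIP_NT node, so the term comes first when it is read out.
    """
    from_term = (term, relation, subject)
    from_subject = (subject, CONVERSE[relation], term)
    return from_term, from_subject

# The subject is "woman"; the RELATIONSHIP_NT node is for "table".
from_term, from_subject = facts("woman", "carries", "table")
print(from_term)
print(from_subject)
```

Both tuples state the same fact. The annotation stores the first form, so a
reader who wants the relation as seen from the subject must take the converse.
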
@ Possessive verbs need careful handling because of the wide range of
meanings they can carry which may not involve ownership as such (cf. French
"j'ai trente ans", or English "I have mumps"). But syntactically they are
just like other non-copular verbs, and we parse them as such.

= (undisplayed text from Figures/possessive.txt)

@ An unusual feature of English is its use of subject-verb inversion:

= (undisplayed text from Figures/inversion.txt)

It would be easy to auto-fix the inversion in sentence (1), by simply
swapping the "on the table" and "Ming vase" subtrees over, but we want
to preserve the distinction because Inform will make some use of it.

Sentence (2) here is arguably just plain wrong, but we do very occasionally
allow that sort of thing in Inform (e.g. "east of X is south of Y").

@ Existential sentences, using the defective subject nounphrase "there", are
marked with an additional annotation.

= (undisplayed text from Figures/there.txt)

In sentences (3) and (4) here, the resulting trees are essentially identical
except for the existential annotation.

Note that "there" as an object phrase is also defective, but not considered
existential (it is more likely an anaphora -- "A woman is there" implies a
reference to a location already being discussed, whereas "There is a woman"
does not).

@ Two sorts of adverbs are recognised, for certainty and occurrence, and they
are handled by making additional annotations to the verb node, not by adding
fresh nodes:

= (undisplayed text from Figures/usingadverbs.txt)

@ We can also support imperative verbs, with "special meanings" which are
not necessarily relational, and do not always lead to |RELATIONSHIP_NT|
subtrees. See //Special Meanings//.

= (undisplayed text from Figures/imperatives.txt)

@ That shows the full range of what happens with verb nodes. Turning back
to noun phrases, we can have serial lists:

= (undisplayed text from Figures/composite.txt)

Note that |AND_NT| nodes always have exactly two children, and that the serial
comma is allowed but not required.

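One way to see how a list of three or more items can be held in nodes with
exactly two children each is the following Python sketch. It is an
illustration only: the functions |split_list|, |and_tree| and |flatten| are
hypothetical, the nodes are plain tuples, and nesting the later items under
the second child is an assumption rather than something the text confirms.

```python
import re

def split_list(text):
    """Split "A, B and C" or "A, B, and C" into its items."""
    # Try ", and" first so that the serial comma is accepted but not required.
    return [part for part in re.split(r",\s*and\s+|\s+and\s+|,\s*", text) if part]

def and_tree(items):
    """Nest two or more items into AND_NT tuples with exactly two children."""
    if len(items) < 2:
        raise ValueError("a list needs at least two items")
    if len(items) == 2:
        return ("AND_NT", items[0], items[1])
    return ("AND_NT", items[0], and_tree(items[1:]))

def flatten(tree):
    """Recover the items of a nested AND_NT tree in their original order."""
    if isinstance(tree, tuple) and tree[0] == "AND_NT":
        return flatten(tree[1]) + flatten(tree[2])
    return [tree]

with_comma = split_list("Beth, a sailor, and the table")
without_comma = split_list("Beth, a sailor and the table")
tree = and_tree(with_comma)
print(tree)
print(flatten(tree))
```

Because every |AND_NT| has the same shape, code which walks the tree needs
only one case for lists, whatever their length.
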
|AND_NT| in conjunction with |RELATIONSHIP_NT| can allow for zeugmas.
Zeugma is sometimes thought to be rare in English and to be basically a comedy
effect, as in the famous Flanders and Swann lyric:

>> She made no reply, up her mind, and a dash for the door.

in which three completely different senses of the same verb are used,
but in which the verb appears only once. It might seem reasonable just to
disallow this. Unfortunately, less extreme zeugmas occur all the time:

>> The red door is west of the Dining Room and east of the Ballroom.

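A zeugma of this milder kind can be unpacked by distributing the single
subject across each relationship under the |AND_NT| node, as in this Python
sketch. The tuple layout and the |distribute| function are assumptions made
for the example, and for simplicity each relationship is recorded by the
wording used in the sentence ("west of") rather than by the annotation the
module would actually attach.

```python
def distribute(subject, phrase):
    """Expand an object phrase into one (subject, wording, term) fact per relationship."""
    kind = phrase[0]
    if kind == "AND_NT":
        _, left, right = phrase
        return distribute(subject, left) + distribute(subject, right)
    if kind == "RELATIONSHIP_NT":
        _, wording, term = phrase
        return [(subject, wording, term)]
    raise ValueError("unexpected node type: " + kind)

# Hypothetical tree for the object phrase of the sentence about the red door.
object_phrase = ("AND_NT",
    ("RELATIONSHIP_NT", "west of", "Dining Room"),
    ("RELATIONSHIP_NT", "east of", "Ballroom"))

door_facts = distribute("red door", object_phrase)
for fact in door_facts:
    print(fact)
```

The verb appears once in the sentence, but the sketch yields two facts, one
for each relationship.
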
@ Now we introduce pronouns to the mix. These are detected automatically
by //linguistics//, and exist in nominative and accusative cases in
English. Note the difference in annotations between "them" and "you",
for example.

= (undisplayed text from Figures/usingpronouns.txt)

@ "Callings" use the special syntax "X called Y", which has to be handled
here in the //linguistics// module so that Y can safely contain wording which
would otherwise have a structural meaning. ("Called" is to Inform as the
backslash character, making letters literal, is to C.)

= (undisplayed text from Figures/callings.txt)

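The escaping effect of "called" can be imitated in a few lines of Python. This
is a deliberately crude toy and not the module's algorithm: the functions
|split_calling| and |noun_phrases| and the labels "CALLED" and "NOUN" are
invented for the example, and the sketch simply assumes that the name runs to
the end of the phrase, which the text above does not specify.

```python
def split_calling(phrase):
    """Split "X called Y" into (X, Y); Y is None if "called" is absent."""
    description, separator, name = phrase.partition(" called ")
    if not separator:
        return phrase, None
    return description, name

def noun_phrases(phrase):
    """Split a phrase on "and", except inside the name given by a calling."""
    description, name = split_calling(phrase)
    if name is not None:
        # Everything after "called" is kept verbatim, including any "and".
        return [("CALLED", description, name)]
    return [("NOUN", part) for part in phrase.split(" and ")]

print(noun_phrases("a cup and a saucer"))
print(noun_phrases("a room called the Cut and Thrust"))
```

In the second phrase the word "and" is part of the name, so it does not split
the phrase into a list as it does in the first.
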
@ The word "with" is often, but not always, used in conjunction with "kind of":

= (undisplayed text from Figures/withs.txt)