Building Working Models of Full Natural-Language Understanding in Limited Pragmatic Domains

James A. Mason - 2010 May 17, 20, 26; June 8; 2012 Aug 16; 2012 Sep 1

Keywords: English language understanding, natural-language processing, NLP, NLU, computational linguistics, dialog system, Java, Augmented Syntax Diagram, ASD, playing card

My current long-term research project is to build working models that understand English as an English speaker does, in realistic pragmatic domains.  Of course, for the foreseeable future, such models will have to be restricted to pragmatic domains that can be modeled completely on a computer.  Nevertheless, limited pragmatic domains permit us to explore and model thoroughly many detailed syntactic and semantic structures of English, most of which should generalize well to less-limited pragmatic domains.  In particular, we should be able to model most of the syntax and semantics of the so-called "function words" of English (articles and other determiners in noun phrases, conjunctions, and prepositions), as contrasted with the "content words" (nouns, adjectives, verbs, and adverbs).  Function words are sometimes called "closed class" words, because they belong to syntactic classes to which new words are almost never added.  Content words are sometimes called "open class" words, because they belong to syntactic classes to which new words are frequently added.
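The closed-class/open-class distinction can be pictured as a lexicon in which the function words are fixed in advance while content words can be added at any time.  The following Java fragment is only a minimal sketch of that idea.  The names LexiconSketch, WordClass, addContentWord and lookup are hypothetical and are not taken from ASDParser or the CardWorld programs, and the sketch assumes, unrealistically, that each word belongs to exactly one syntactic class.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not part of ASDParser or CardWorld.
final class LexiconSketch {

    // Each syntactic class is marked as closed (function words)
    // or open (content words).
    enum WordClass {
        ARTICLE(true), CONJUNCTION(true), PREPOSITION(true),
        NOUN(false), ADJECTIVE(false), VERB(false), ADVERB(false);

        final boolean closed;

        WordClass(boolean closed) {
            this.closed = closed;
        }
    }

    // A tiny sample of closed-class words, fixed when the class is loaded.
    private static final Map<String, WordClass> FUNCTION_WORDS = Map.of(
            "the", WordClass.ARTICLE,
            "a", WordClass.ARTICLE,
            "and", WordClass.CONJUNCTION,
            "or", WordClass.CONJUNCTION,
            "of", WordClass.PREPOSITION,
            "on", WordClass.PREPOSITION);

    // Open-class words can be added while the program runs.
    private final Map<String, WordClass> contentWords = new HashMap<>();

    // Adds a content word; closed classes do not accept new members.
    void addContentWord(String word, WordClass wordClass) {
        if (wordClass.closed) {
            throw new IllegalArgumentException(
                    wordClass + " is a closed class; no new words can be added");
        }
        contentWords.put(word.toLowerCase(Locale.ROOT), wordClass);
    }

    // Returns the class of a word, or null if the word is unknown.
    WordClass lookup(String word) {
        String key = word.toLowerCase(Locale.ROOT);
        WordClass functionClass = FUNCTION_WORDS.get(key);
        return functionClass != null ? functionClass : contentWords.get(key);
    }

    public static void main(String[] args) {
        LexiconSketch lexicon = new LexiconSketch();
        lexicon.addContentWord("card", WordClass.NOUN);
        System.out.println(lexicon.lookup("the"));
        System.out.println(lexicon.lookup("card"));
    }
}
```

Because the closed classes are small and fixed, a model for a limited domain can aim to cover them thoroughly, whereas the open classes need only the vocabulary that the domain requires.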

I have chosen ordinary playing cards as the basis for a first pragmatic domain for which to build models of English-language understanding.  That domain is simple enough to be modeled fairly easily in computer software, yet it is rich enough to allow exploration of many syntactic and semantic features of English.  I am building a succession of models of English-language understanding for that domain, which I call CardWorld.  The first two implementations are CardWorld1 and CardWorld2, which are available from this web site in both compiled and open-source form.  The latest model can also be run from this link as a Java Web Start applet created by Roxanne Parent.  The CardWorld models can be used with various kinds of input: pointing with a stylus or touch screen, English typed on a keyboard, and spoken English recognized by a front-end program such as Dragon NaturallySpeaking.
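To suggest why the playing-card domain is easy to model completely, here is a minimal Java sketch of such a domain.  It is not the CardWorld source code.  All of the names in it (Suit, Rank, Card, CardDomainSketch, fullDeck, matching, englishName) are assumptions made for illustration, and the sketch performs no parsing; the description of the cards is written directly as a Java predicate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;

// Hypothetical names throughout: an illustration, not the CardWorld source.
enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES }

enum Rank {
    ACE, TWO, THREE, FOUR, FIVE, SIX, SEVEN,
    EIGHT, NINE, TEN, JACK, QUEEN, KING
}

final class Card {
    final Rank rank;
    final Suit suit;

    Card(Rank rank, Suit suit) {
        this.rank = rank;
        this.suit = suit;
    }

    // Color is derived from the suit, so "the red cards" needs no extra data.
    boolean isRed() {
        return suit == Suit.DIAMONDS || suit == Suit.HEARTS;
    }

    // An English noun phrase naming this card, e.g. "the queen of hearts".
    String englishName() {
        return "the " + rank.name().toLowerCase(Locale.ROOT)
                + " of " + suit.name().toLowerCase(Locale.ROOT);
    }
}

final class CardDomainSketch {

    // Builds the standard 52-card deck, suit by suit.
    static List<Card> fullDeck() {
        List<Card> deck = new ArrayList<>();
        for (Suit suit : Suit.values()) {
            for (Rank rank : Rank.values()) {
                deck.add(new Card(rank, suit));
            }
        }
        return deck;
    }

    // Returns, in their original order, the cards that fit a description.
    static List<Card> matching(List<Card> cards, Predicate<Card> description) {
        List<Card> result = new ArrayList<>();
        for (Card card : cards) {
            if (description.test(card)) {
                result.add(card);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A hand-written stand-in for the meaning of "the red queens".
        List<Card> redQueens =
                matching(fullDeck(), c -> c.isRed() && c.rank == Rank.QUEEN);
        for (Card card : redQueens) {
            System.out.println(card.englishName());
        }
    }
}
```

Running main prints "the queen of diamonds" and then "the queen of hearts".  The entire state of the domain fits in a few small types, which is the sense in which the domain can be modeled completely, while phrases such as "the red queens" or "all the cards except the aces" still exercise determiners, conjunctions and prepositions.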

Documentation for the first two versions of CardWorld is provided in CardWorld1Documentation.html and CardWorld2Documentation.html.  Note that, for setting and getting values of semantic feature variables, CardWorld1 and CardWorld2 use only the basic tools provided by ASDParser and ASDDecider.  They do not require use of the SemanticValue class hierarchy.

CardWorld models can be extended in many directions, including these:


Permit other operations on playing cards and collections of cards:
Introduce additional agents into a card world, give them different views of that world, and allow various kinds of communication among them.


Add semantic structures required for the extended pragmatics, including
Extensions like those may need the SemanticValue class hierarchy and more.


Add vocabulary and grammar structures required for the extended pragmatics and semantics, including
All such syntactic extensions can be accomplished with ASDEditor and ASDParser.