A SemanticValue class hierarchy for use with ASD grammars

last revision 2011 Oct 11 (minor correction of 2005 May 14 version)

Introduction

ASD grammars permit augmentations to grammar nodes which are names of Java functions that perform arbitrary computations.  That allows ASD grammars to have maximum flexibility.  However, for many purposes it is useful to have a more specific, though restricted, semantic representation.  Such a representation, implemented as a class hierarchy in Java, is described here.  The representation permits the creation of objects which consist of any number of feature-value pairs: that is, strings representing feature names and associated objects representing feature values. 

The root of the class hierarchy is a new abstract class, SemanticValue, which provides functions for updating and displaying semantic representations that consist of fixed numbers of named semantic features and their values. Such feature-value pairs can be called semantic parameters, whose settings determine the basic meaning which an instance of a subclass of SemanticValue represents.

For example, the meanings of English ordinal phrases such as "third", "next", "previous", "last", "next to last", "fourth from last", "previous but one", "next but two" and the like can be represented by semantic parameters which describe a starting position in a sequence (origin = "first", origin = "last", or origin = "current"), a direction in which to count from the starting position (direction = 1, meaning forward, or direction = -1, meaning backward), and an integral position to which to count.  Thus "third" can be represented by [origin = "first", direction = 1, position = 3].  Other examples are:

Similarly, the meanings of English vague quantifiers such as "few", "little", "many", "much", and "a lot" can be represented by semantic parameters which specify a position on an ordinal size scale, a parameter indicating whether the discreteness of the quantity is "discrete", "mass", or unspecified, and a parameter indicating whether the grammatical number of the phrase describing the quantity is "singular", "plural", or unspecified.  Examples are:

A subclass of SemanticValue, ModifiableSemantics, adds functions for creating and manipulating variable numbers of named semantic features and values.  Such feature-value pairs can be called semantic modifiers, which alter meaning representations that consist of semantic parameters.
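
The ordinal parameters described above (origin, direction, position) can be made concrete with a small sketch that resolves a parameter triple to a zero-based index in a sequence.  This is only an illustration and is not part of the packages described here: the class name OrdinalSketch, the method resolve, and the rule for combining the three parameters are assumptions, chosen so that "third" selects the third element and "next" selects the element after the current one.

```java
// Hypothetical illustration; not a class from semanticvalues or englishdemo.
final class OrdinalSketch {
    final String origin;   // "first", "last", or "current"
    final int direction;   // 1 = forward, -1 = backward
    final int position;    // how far to count

    OrdinalSketch(String origin, int direction, int position) {
        this.origin = origin;
        this.direction = direction;
        this.position = position;
    }

    /**
     * Returns the zero-based index that this ordinal selects in a sequence of
     * the given length, or -1 if the result falls outside the sequence.
     * Assumed counting rule: from "first" or "last", position 1 is the origin
     * itself; from "current", position 1 is one step away from the origin.
     */
    int resolve(int length, int currentIndex) {
        int index;
        switch (origin) {
            case "first":
                index = direction * (position - 1);
                break;
            case "last":
                index = (length - 1) + direction * (position - 1);
                break;
            case "current":
                index = currentIndex + direction * position;
                break;
            default:
                throw new IllegalArgumentException("unknown origin: " + origin);
        }
        return (index >= 0 && index < length) ? index : -1;
    }

    public static void main(String[] args) {
        int length = 10;       // a sequence of ten items
        int currentIndex = 4;  // the fifth item is the current one
        OrdinalSketch third = new OrdinalSketch("first", 1, 3);
        OrdinalSketch lastButOne = new OrdinalSketch("last", -1, 2);
        OrdinalSketch next = new OrdinalSketch("current", 1, 1);
        System.out.println(third.resolve(length, currentIndex));
        System.out.println(lastButOne.resolve(length, currentIndex));
        System.out.println(next.resolve(length, currentIndex));
    }
}
```

Under the assumed counting rule, the three calls select the third item, the item before the last one, and the item after the current one, respectively.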

Optional modifiers of ordinal phrases or of vague quantity phrases include negatives (e.g. "not next to last",  "not much") and adverbial modifiers which convey aspects of a speaker's attitude (e.g. "probably previous", "surprisingly few").  Variable numbers of such optional modifiers can be contained in a list following  the fixed semantic parameters for a phrase.  The following are examples,  in which syntactic subphrase groupings are shown by parentheses, semantic representations are shown in square brackets, and the words all in capitals in the semantic representations stand for values on an ordinal scale:

"probably last but one"
    (1)    Syntax: (probably (last but one))
            Semantics: [origin = "last", direction = -1, position = 2; (probability = MODERATELY_HIGH) ]

"certainly not first":
    (1)    Syntax: (certainly (not first))
            Semantics: [origin = "first", direction = 1, position = 1; (probability = HIGHEST_POSSIBLE, negative = true) ]
    (2)    Syntax: ((certainly not) first)
            Semantics: [origin = "first", direction = 1, position = 1; (probability = LOWEST_POSSIBLE) ]

Notice that in alternative (2) the meaning of the subphrase "certainly not" is represented by a probability, not a negative.  It would be semantically equivalent in this case to represent it by a negative.

"possibly not next":
    (1)    Syntax: (possibly (not next))
            Semantics: [origin = "current", direction = 1, position = 1; (probability = MODERATELY_LOW, negative = true) ]
    (2)    Syntax: ((possibly not) next)
            Semantics: [origin = "current", direction = 1, position = 1; (negative = [true; (probability = MODERATELY_LOW)] ) ]

Notice that in alternative (2) the meaning of the subphrase "possibly not" is represented by a negative modified by a probability, rather than by a simple probability.

"surprisingly possibly many":
    (1)    Syntax: (surprisingly (possibly many))
            Semantics: [size = LARGE, discreteness = "discrete", number = "plural"; (surprisingness = HIGH, probability = MODERATELY_LOW) ]
    (2)    Syntax: ((surprisingly possibly) many)
            Semantics: [size = LARGE, discreteness = "discrete", number = "plural"; (probability = [MODERATELY_LOW; (surprisingness = HIGH) ] ) ]

"probably surprisingly little"
    (1)    Syntax: (probably (surprisingly little))
            Semantics: [size = SMALL, discreteness = "mass", number = "singular"; (probability = HIGH, surprisingness = HIGH) ]
    (2)    Syntax: ((probably surprisingly) little)
            Semantics: [size = SMALL, discreteness = "mass", number = "singular"; (surprisingness = [HIGH; (probability = HIGH) ] ) ]

SemanticValue class hierarchy with demonstration subclasses, and an example grammar that uses them

The SemanticValue class hierarchy, in this demonstration version, consists of the following classes.  Subclass relationships are shown by indenting, with subclasses listed below their parent classes and indented one level farther to the right:

    SemanticValue
       ModifiableSemantics
          DegreeSemantics
             GraderSemantics
             ThresholdSemantics
          MagnitudeSemantics
          NegativeSemantics
          OrdinalSemantics
          ProbabilitySemantics
          QuantitySemantics
             QuantityCardinalSemantics
             QuantityThresholdSemantics
             QuantityVagueSemantics
          SurprisingnessSemantics
       ModifierSemantics
       Qualifier

The classes SemanticValue, ModifiableSemantics, ModifierSemantics, and Qualifier are basic to the structure of the general semantic representation.  They have been put in a package named semanticvalues.  The remaining classes are in a package named englishdemo.  They demonstrate how to represent meanings of various English phrase types which are recognized by a grammar, npXdemo.grm, which is the result of merging demonstration versions of the following ASD grammar modules that specify parts of a grammar of English noun phrases:

    cardord.grm
    cardordirp.grm
    grader1.grm
    grader2.grm
    quant-vague.grm
    quantity-np.grm
    quantity-p.grm

The file XDemoxrefs.txt contains cross-referencing information for the phrase types and grammar modules in npXdemo.grm.  It was produced by using the DefinedUsed utility in the package asdx.  (See the Software download page.)  As usual, the grammar modules that make up npXdemo.grm can be viewed both with the ASDEditor utility of package asd and with an ASCII file editor.  The merged file npXdemo.grm, of course, is too large and has too many overlapping syntax diagrams to be viewed easily with the ASDEditor.

The demonstration semantic value classes and grammar are only parts of a much larger semantic class hierarchy and grammar for English noun phrases which has not yet been published.  Some of them, quantity-np.grm in particular, are much smaller than the modules with the same names in the larger grammar.  They have been extracted from it in order to provide a demonstration that is reasonably small but which still illustrates most of the methodology used in the full grammar and semantics.

The class NpXDemoSemantics implements the interface ASDSemantics which is defined in the package asd. It defines all of the member functions for computing the  semanticActions and semanticValues with which nodes in the example grammar npXdemo.grm are augmented.  The example grammar and the corresponding semantics can be tested with a Java driver program NPXDemoTester, which is part of a Java package englishdemo.  Package englishdemo also includes the same EnglishWord class that is in package english.

Classes SemanticValue, ModifiableSemantics, ModifierSemantics, and Qualifier

This section describes the classes in the package semanticvalues, which provides general mechanisms for representing semantic values, whether of English or of other languages.  SemanticValue, the abstract class that is the root of the class hierarchy for representations of semantic values, provides general member functions named modifyBy for assigning values to member variables of instances of its subclasses (other than the subclasses described in this section).  Such member variables are used to represent semantic parameters.   SemanticValue also provides general toString functions for producing displayable representations of its subclass instances as Strings.  Finally, it provides a general clone function so that semantic values of subphrases can be copied rather than shared during parsing, so they don't become corrupted when backtracking occurs during parsing.
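
The reason for copying rather than sharing can be seen in a small sketch.  It assumes a hypothetical class, CopyableValueSketch, with one list-valued member; the actual clone function of SemanticValue is not reproduced here and may be implemented differently.  The point illustrated is only that a copy must not share mutable members with the original if the original is to survive a parse path that is later abandoned.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the SemanticValue class itself.
class CopyableValueSketch implements Cloneable {
    String origin = "current";
    List<String> modifierNames = new ArrayList<>();

    @Override
    public CopyableValueSketch clone() {
        try {
            CopyableValueSketch copy = (CopyableValueSketch) super.clone();
            // Copy the mutable list so that the copy and the original
            // no longer share it.
            copy.modifierNames = new ArrayList<>(modifierNames);
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen, because this class implements Cloneable.
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        CopyableValueSketch subphraseValue = new CopyableValueSketch();
        // One parse path modifies a copy of the subphrase value ...
        CopyableValueSketch attempt = subphraseValue.clone();
        attempt.modifierNames.add("negative");
        // ... and if the parser backtracks, the original is still unmodified.
        System.out.println(subphraseValue.modifierNames.isEmpty());
        System.out.println(attempt.modifierNames);
    }
}
```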

ModifiableSemantics extends the class SemanticValue to permit variable numbers of optional semantic feature-value pairs (i.e., semantic modifiers).  It  overrides and extends one of the modifyBy member functions inherited from SemanticValue.  In particular, it extends the member function
   public SemanticValue modifyBy(String modifierName, SemanticValue modifierValue)
so that the function first tries to assign the given modifierValue to a member variable that matches the given modifierName.  If no such member variable exists, the function adds a new pair consisting of the given modifierName and modifierValue to a list of optional semantic modifiers for the instance of a subclass of SemanticValue to which the modifyBy function is applied.  ModifiableSemantics also provides a boolean member variable, isModifiable, whose value is true by default but can be set to false if a particular instance of a subclass of ModifiableSemantics should not be permitted to have semantic modifiers other than its fixed semantic parameters.
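
The two-step behavior of modifyBy described above might be sketched as follows.  This is a guess at one possible mechanism, not the actual implementation: it assumes that member variables are matched by name through java.lang.reflect, it uses plain Objects in place of SemanticValue instances, and it assumes that an exception is thrown when isModifiable is false.  The class names ModifiableSketch and OrdinalDemo are hypothetical.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ModifiableSemantics.
class ModifiableSketch {
    boolean isModifiable = true;
    // Optional modifiers as {name, value} pairs, newest first.
    final List<Object[]> modifiers = new ArrayList<>();

    ModifiableSketch modifyBy(String modifierName, Object modifierValue) {
        try {
            // Step 1: look for a member variable of the concrete class
            // whose name matches modifierName, and assign to it.
            Field field = getClass().getDeclaredField(modifierName);
            field.setAccessible(true);
            field.set(this, modifierValue);
            return this;
        } catch (NoSuchFieldException e) {
            // No such semantic parameter; continue with step 2.
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        // Step 2: record the pair as an optional semantic modifier.
        if (!isModifiable) {
            // Assumed behavior; the source does not say what happens here.
            throw new IllegalStateException("modifiers are not permitted");
        }
        modifiers.add(0, new Object[] { modifierName, modifierValue });
        return this;
    }
}

// Hypothetical subclass with two fixed semantic parameters.
class OrdinalDemo extends ModifiableSketch {
    String origin = "first";
    Integer position = 1;

    public static void main(String[] args) {
        OrdinalDemo value = new OrdinalDemo();
        value.modifyBy("origin", "last");                 // sets a parameter
        value.modifyBy("probability", "MODERATELY_HIGH"); // adds a modifier
        System.out.println(value.origin);
        System.out.println(value.modifiers.size());
    }
}
```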

Class ModifiableSemantics uses an auxiliary class, ModifierSemantics, to provide nodes for a linked list to contain the optional semantic feature-value pairs.  ModifiableSemantics also provides member functions for adding new feature-value pairs to that list and for retrieving, updating, and removing feature-value pairs from the list.  Class ModifiableSemantics manages the list of feature-value pairs by adding new pairs to the beginning of the list, and by searching for existing pairs by feature name, from the beginning of the list.  (For readers who know LISP, this is similar to the way in which association lists are managed in LISP.)
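
The list management just described can be sketched as an association list in Java.  The class ModifierListSketch and its member names are hypothetical, and values are plain Objects rather than SemanticValue instances; the sketch shows only the technique of adding at the front and searching by name from the front.

```java
// Hypothetical illustration of an association list of feature-value pairs.
final class ModifierListSketch {
    private static final class Node {
        final String name;
        Object value;
        Node next;

        Node(String name, Object value, Node next) {
            this.name = name;
            this.value = value;
            this.next = next;
        }
    }

    private Node head;

    /** Adds a new pair at the beginning of the list. */
    void add(String name, Object value) {
        head = new Node(name, value, head);
    }

    /** Returns the value of the first pair with the given name, or null. */
    Object get(String name) {
        for (Node n = head; n != null; n = n.next) {
            if (n.name.equals(name)) return n.value;
        }
        return null;
    }

    /** Replaces the value of the first pair with the given name, if any. */
    boolean update(String name, Object value) {
        for (Node n = head; n != null; n = n.next) {
            if (n.name.equals(name)) {
                n.value = value;
                return true;
            }
        }
        return false;
    }

    /** Removes the first pair with the given name, if any. */
    boolean remove(String name) {
        Node previous = null;
        for (Node n = head; n != null; previous = n, n = n.next) {
            if (n.name.equals(name)) {
                if (previous == null) head = n.next;
                else previous.next = n.next;
                return true;
            }
        }
        return false;
    }

    /** Lists the feature names in list order, separated by spaces. */
    String names() {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(n.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ModifierListSketch list = new ModifierListSketch();
        list.add("negative", Boolean.TRUE);          // added first
        list.add("probability", "MODERATELY_LOW");   // added second, so listed first
        System.out.println(list.names());
    }
}
```

Because new pairs go to the front, the modifier added last is the first one listed.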

Class Qualifier provides instances which pair Strings and SemanticValue instances into single units which can be used as values set by semanticAction computations or returned by semanticValue computations  which are associated with ASD grammar nodes.  The String and SemanticValue components of such a pair can then be used as arguments for the modifyBy functions defined in classes SemanticValue and ModifiableSemantics.

Subclasses of SemanticValue which are in the package englishdemo

The subclasses of SemanticValue which are in the package englishdemo are all actually subclasses of the class ModifiableSemantics.  Their instances can represent meanings of specific English words and phrases, and presumably of translationally equivalent words and phrases in natural languages other than English.  Specifically, the semantic classes in the englishdemo package provide models for the meanings of words and phrases that express numerical quantities, vague quantities, ordinals, and negatives.  The package also contains subclasses of ModifiableSemantics to represent meanings of adjectives and adverbs that express degrees and that express speakers' attitudes or judgments such as subjective probabilities and surprisingness.

The class MagnitudeSemantics provides a general representation for speakers' judgments of size, strength, probability, or the like on an ordinal scale.  It also provides common symbolic names for the positions on the scale.  It can represent either a nine-position open-ended scale with values from 1 (TINY) to 9 (HUGE) or an extended eleven-position closed-ended scale with values from 0 (SMALLEST_POSSIBLE) to 10 (LARGEST_POSSIBLE).
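
The two scales can be pictured with the following sketch.  Only the four symbolic names and their numeric values are taken from the description above; the class name MagnitudeScaleSketch and the two checking methods are hypothetical, and the symbolic names for the intermediate positions are deliberately left out because they are not given here.

```java
// Hypothetical illustration of the two ordinal scales.
final class MagnitudeScaleSketch {
    static final int SMALLEST_POSSIBLE = 0;  // extended scale only
    static final int TINY = 1;               // lowest open-ended position
    static final int HUGE = 9;               // highest open-ended position
    static final int LARGEST_POSSIBLE = 10;  // extended scale only

    /** True if value lies on the nine-position open-ended scale. */
    static boolean isOnOpenScale(int value) {
        return value >= TINY && value <= HUGE;
    }

    /** True if value lies on the eleven-position closed-ended scale. */
    static boolean isOnExtendedScale(int value) {
        return value >= SMALLEST_POSSIBLE && value <= LARGEST_POSSIBLE;
    }

    public static void main(String[] args) {
        System.out.println(isOnOpenScale(LARGEST_POSSIBLE));
        System.out.println(isOnExtendedScale(LARGEST_POSSIBLE));
    }
}
```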

The class DegreeSemantics has subclasses GraderSemantics and ThresholdSemantics that can represent degrees of descriptors or quantities that can be judged on one-dimensional scales.  Instances of GraderSemantics represent subjective positions on such scales, as expressed by instances of class MagnitudeSemantics.  Examples are the meanings of adverbs like "slightly", "somewhat", "fairly", "moderately", "rather", "quite", "very", "extremely" in phrases like "slightly smaller", "somewhat heavy", "fairly old", "rather many", "quite probably", and "very few".  Instances of ThresholdSemantics represent subjective transition positions on such scales; examples are the meanings of "adequate(ly)", "enough", "sufficient(ly)", "excessive(ly)", and "too" in phrases like "adequately large", "an adequate amount", "old enough", "enough cheaper", "sufficiently many", "a sufficient size", "excessively tall", "an excessive quantity", and "too little".  An instance of ThresholdSemantics can express one of two levels of threshold:  "sufficient" or "excessive"; their negations can be expressed by modifying an instance of ThresholdSemantics with an instance of NegativeSemantics.

Instances of NegativeSemantics can represent the meaning of  "not", as well as the negative component of the meanings of words and phrases like "never", "impossible", "insufficient", "unlikely".  Class NegativeSemantics also allows representation of an extreme negative, as for the meaning of "not at all", as opposed to an ordinary negative.

Class QuantitySemantics is the root class for subclasses that represent discrete or mass quantities.  For syntactic purposes, it also allows for representation of the grammatical number, "singular" or "plural", of words and phrases which express quantities.  The englishdemo package has three subclasses of QuantitySemantics: QuantityCardinalSemantics, whose instances represent non-negative integer quantities (e.g. the meanings of "none", "twenty", "three hundred", 2500), QuantityThresholdSemantics, whose instances represent quantities expressed as thresholds (e.g. "enough"), and QuantityVagueSemantics, whose instances represent quantities judged by the speaker on an ordinal scale represented by an instance of MagnitudeSemantics.  Examples of phrases whose meanings can be represented by instances of QuantityVagueSemantics include "a very few", "some", "a little", "several", "a moderate amount", "fairly many", "very much", "extremely many", and "a huge amount".

Instances of OrdinalSemantics represent the meanings of words and phrases which specify a positive integer position as well as an explicit or implicit starting point and direction on a one-dimensional scale.  Examples of such words and phrases are "first", "last", "next", "previous", "third", "next to last", "fourth [from] last", "previous but one", "next but two", and "second from next".

Class ProbabilitySemantics provides representations for subjective probability judgments, as expressed by words and phrases like "impossible", "[just] possibly", "fairly probable", "probably", "very probable", "almost certain", and "[most] certainly".  It uses an instance of MagnitudeSemantics to represent the probability judgment.   It also provides common symbolic names, including IMPOSSIBLE, UNLIKELY, POSSIBLE, PROBABLE, VERY_PROBABLE, and CERTAIN, for positions on a MagnitudeSemantics scale that represents a probability judgment.

Finally for this example demonstration, class SurprisingnessSemantics provides representations for speakers' subjective judgment of surprisingness (versus  expectedness).   It is just one of a number of similar classes, in a larger semantic model for English, which can represent speakers' attitudes and judgments of subjective qualities.  Not surprisingly, the judgments in instances of SurprisingnessSemantics are expressed by instances of class MagnitudeSemantics.

An example of parsing with npXdemo.grm and constructing a meaning representation using SemanticValue subclasses

Here we consider how the phrase "possibly not next", which was used as an example above, is parsed and how its semantic value is computed.  During a parse of the phrase as a phrase of type ORDIR-P, using the grammar npXdemo.grm, the word "possibly" matches the node (possibly 1) in the grammar (see the grammar module grader1.grm).  That node, being a final node in the grammar, invokes its associated semanticValue computation function, grader1_possibly_1_v, which is defined in class NpXDemoSemantics:

   public Object grader1_possibly_1_v()
   {  ProbabilitySemantics result
         = new ProbabilitySemantics(ProbabilitySemantics.POSSIBLE);
      return new Qualifier("probability", result);
   }

That function computes an instance of class Qualifier to become the value of a subphrase of type QUALIFIER-LY which ends at "possibly".  The second component of that Qualifier instance is an instance of class ProbabilitySemantics initialized to represent the meaning of "possibly".  (Its first component, the string "probability", will be used as the name of an optional semantic modifier.)

Then the phrase type QUALIFIER-LY matches the node (QUALIFIER-LY 1) in the grammar (see the grammar module grader2.grm).  That node invokes its semanticAction function grader2_QUALIFIER_LY_1 in class NpXDemoSemantics:

   public String grader2_QUALIFIER_LY_1()
   {  set("qualifier", nodeValue());
      return null;
   }

The function sets the value of a semantic feature variable named "qualifier" managed by the ASDParser to be the value of the node (QUALIFIER-LY 1), which is the Qualifier instance created and returned by the grader1_possibly_1_v function.

Continuing to parse the phrase "possibly not next" with the grammar npXdemo.grm and using the semantic computations defined in NpXDemoSemantics ultimately yields a successful parse with the following phrase structure tree, shown sidewise with levels represented by indenting.  (The phrase structure tree was displayed by the program NPXDemoTester that was mentioned earlier.  The asterisk points to the dummy header node at the top level of the phrase structure.)

*->nil nil
   ORDIR-P nil
      QUALIFIER-P 1
         QUALIFIER1 1
            QUALIFIER-LY 1
               possibly 1
            $$ 16
         $$ 21
      ORDIR-P 1
         QUALIFIER-P 1
            QUALIFIER1 1
               NEGQUALIFIER 1
                  NEGATIVE 1
                     not 1
                     $$ 19
                  $$ 17
            $$ 21
         ORDIR-P 1
            ORDIR 1
               next 1
               $$ 5

The phrase structure can also be shown in abbreviated form by the string "(possibly (not next))", with parentheses indicating levels of structure. Its semantic value, as displayed by the toString() function of class SemanticValue, invoked by the program NPXDemoTester, is

englishdemo.OrdinalSemantics:
   direction = 1;
   origin = current;
   position = 1;
   isModifiable = true;
   modifiers = semanticvalues.ModifierSemantics:
      modifierName = probability;
      modifierValue = englishdemo.ProbabilitySemantics:
         judgment = englishdemo.MagnitudeSemantics:
            value = 4;
            isModifiable = true;
         isModifiable = true;
      otherModifiers = semanticvalues.ModifierSemantics:
         modifierName = negative;
         modifierValue = englishdemo.NegativeSemantics:
            negative = true;
            extreme = false;
            isModifiable = true;

In the displayed semantic value, each instance of a subclass of the root class semanticvalues.SemanticValue is shown with its member variables listed below it and indented to the right.  In the example, the value 4 under "judgment = englishdemo.MagnitudeSemantics:" represents the meaning of the word "possibly" on the ordinal scale defined in the class MagnitudeSemantics.  Notice how the member variable modifiers, which is defined in the class semanticvalues.ModifiableSemantics, has as its value a list of two instances of semanticvalues.ModifierSemantics, one for the meaning of the word "possibly" and the second for the meaning of the word "not".

To trace the functions that compute the semantic value, one can proceed as follows:  For each of the grammar instances which appear in the phrase structure tree resulting from the parse, look up that grammar instance in npXdemo.grm.  Look up the grammar instances in order from the bottom of the tree (shown at right in the phrase structure diagram above) to the top of the tree (shown at left in the diagram), and from the beginning of the phrase to the end.  When the grammar instance has been found, find the function names that appear in its semanticAction field (the fifth field) and/or its semanticValue field (the fourth field).  Finally, locate the definitions of those functions in the Java source code NpXDemoSemantics.java and trace that source code.

Note: Because of the size and modularity of the grammar npXdemo.grm, a naming convention has been used to name the semantic functions with which the grammar is augmented, so that the grammar module from which each function is invoked can be identified:  The name of each function begins with the name of a grammar module (e.g. grader1, grader2, or cardordirp), and is followed by the word or phrase type and instance number which identify the node in that grammar module from which the function is invoked.  Names that end in "_v" identify functions that perform semanticValue computations at final nodes in the grammar.  Names that do not end in "_v" identify functions that perform semanticAction computations for nodes in the grammar.

A coding convention corresponding to the function-naming convention has been used in NpXDemoSemantics.java:  The definitions of the Java member functions whose names appear in semanticAction or semanticValue fields of grammar nodes are listed in alphabetical order of function names.
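
The function-naming convention can be summarized by a small helper.  The class SemanticFunctionNames and its two methods are hypothetical and are not part of NpXDemoSemantics.java.  The replacement of hyphens by underscores in node names is inferred from names such as grader2_QUALIFIER_LY_1; how hyphens in module names such as quant-vague are handled is not stated here, so the sketch leaves module names unchanged.

```java
// Hypothetical helper that builds names according to the convention above.
final class SemanticFunctionNames {
    /** Name of a semanticAction function for a node in a grammar module. */
    static String actionName(String module, String nodeName, int instance) {
        // A hyphen is not legal in a Java identifier, so it becomes "_".
        return module + "_" + nodeName.replace('-', '_') + "_" + instance;
    }

    /** Name of a semanticValue function for a final node in a module. */
    static String valueName(String module, String nodeName, int instance) {
        return actionName(module, nodeName, instance) + "_v";
    }

    public static void main(String[] args) {
        System.out.println(valueName("grader1", "possibly", 1));
        System.out.println(actionName("grader2", "QUALIFIER-LY", 1));
        System.out.println(valueName("grader2", "$$", 5));
    }
}
```

Note that the instance number in a function name is the one used in the grammar module, which may differ from the instance number of the same node in the merged grammar npXdemo.grm.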

Below is a list of the specific function names which appear in grammar nodes that correspond to nodes in the phrase structure tree shown above.  (They include the two functions grader1_possibly_1_v and grader2_QUALIFIER_LY_1 which were discussed earlier.)  They are listed in the order in which the functions were invoked by the ASDParser during bottom-up construction of the phrase structure tree.

    grader1_possibly_1_v          in the semantic value field of node (possibly 1)
    grader2_QUALIFIER_LY_1        in the semantic action field of node (QUALIFIER-LY 1)
    grader2_$$_5_v                in the semantic value field of node ($$ 16) of the
                                     merged grammar npXdemo.grm, which is the same as
                                     node ($$ 5) of the grammar module grader2.grm
    grader2_QUALIFIER1_1          in the semantic action field of node (QUALIFIER1 1)
    grader2_$$_10_v               in the semantic value field of node ($$ 21) of
                                     npXdemo.grm, which is the same as node ($$ 10)
                                     of grammar module grader2.grm

    cardordirp_QUALIFIER_P_1      in the semantic action field of node (QUALIFIER-P 1)
    grader2_not_1                 in the semantic action field of node (not 1)
    grader2_$$_8_v                in the semantic value field of node ($$ 19) of
                                     npXdemo.grm, which is the same as node ($$ 8)
                                     of grammar module grader2.grm

    grader2_NEGATIVE_1            in the semantic action field of node (NEGATIVE 1)
    grader2_$$_6_v                in the semantic value field of node ($$ 17) of
                                     npXdemo.grm, which is the same as node ($$ 6)
                                     of grammar module grader2.grm
    grader2_NEGQUALIFIER_1_v      in the semantic value field of node (NEGQUALIFIER 1)
    grader2_QUALIFIER1_1          in the semantic action field of node (QUALIFIER1 1)
    grader2_$$_10_v               in the semantic value field of node ($$ 21) of
                                     npXdemo.grm, which is the same as node ($$ 10)
                                     of grammar module grader2.grm

    cardordirp_QUALIFIER_P_1      in the semantic action field of node (QUALIFIER-P 1)
    cardordirp_next_1             in the semantic action field of node (next 1)
    cardordirp_$$_2_v             in the semantic value field of node ($$ 5) of
                                     npXdemo.grm, which is the same as node ($$ 2)
                                     of grammar module cardordirp.grm
    cardordirp_ORDIR_1_v          in the semantic value field of node (ORDIR 1)
    cardordirp_ORDIR_P_1_v        in the semantic value field of node (ORDIR-P 1)
    cardordirp_ORDIR_P_1_v        in the semantic value field of node (ORDIR-P 1)


After the first successful parse, a subsequent parse of the same phrase, "possibly not next", yields an alternative parse shown by the parenthesized string "((possibly not) next)".  The full phrase structure for it is

*->nil nil
   ORDIR-P nil
      QUALIFIER-P 1
         QUALIFIER-P 3
            QUALIFIER1 1
               QUALIFIER-LY 1
                  possibly 1
               $$ 16
            $$ 21
         QUALIFIER1 1
            NEGQUALIFIER 1
               NEGATIVE 1
                  not 1
                  $$ 19
               $$ 17
         $$ 21
      ORDIR-P 1
         ORDIR 1
            next 1
            $$ 5

and the semantic value is

englishdemo.OrdinalSemantics:
   direction = 1;
   origin = current;
   position = 1;
   isModifiable = true;
   modifiers = semanticvalues.ModifierSemantics:
      modifierName = negative;
      modifierValue = englishdemo.NegativeSemantics:
         extreme = false;
         isModifiable = true;
         modifiers = semanticvalues.ModifierSemantics:
            modifierName = probability;
            modifierValue = englishdemo.ProbabilitySemantics:
               judgment = englishdemo.MagnitudeSemantics:
                  value = 4;
                  isModifiable = true;
               isModifiable = true;

Of course, the computation of this semantic value can be traced in the same way as for the first phrase structure and semantic value for "possibly not next" that were found by the ASDParser using the grammar npXdemo.grm.

Usage notes

The Java class files for all of the current classes in the SemanticValue class hierarchy (described above) are contained in the Java archive file englishdemo.jar, which is also available in the compressed files englishdemo.zip and englishdemo.tar.gz.  If you save the file englishdemo.jar and the grammar file npXdemo.grm to a single subdirectory on your computer, together with the archive asddigraphs.jar that the commands below place on the class path, then you can run the test program NPXDemoTester by whichever of the following commands is appropriate for your operating system:
    under Microsoft Windows:
    java -cp asddigraphs.jar;englishdemo.jar;. englishdemo/NPXDemoTester

    under versions of Unix:
    java -cp asddigraphs.jar:englishdemo.jar:. englishdemo/NPXDemoTester


