Class EncodedWord

java.lang.Object
  extended by Word
      extended by EncodedWord
All Implemented Interfaces:
java.lang.Comparable

public class EncodedWord
extends Word

A class for words encoded as (a) a String representing the word, (b) an int representing the frequency of the word in the target language, and (c) a String representing the numeric code for the word, as per the telephone keypad.


Field Summary
 
Fields inherited from class Word
freq, word
 
Constructor Summary
EncodedWord(java.lang.String s)
          Construct an EncodedWord object.
EncodedWord(java.lang.String s, int n)
          Construct an EncodedWord object.
EncodedWord(java.lang.String s, int n, java.lang.String cde)
          Construct an EncodedWord object.
 
Method Summary
static char encodeLetter(char c)
          Encode a character as per a phone keypad.
static java.lang.String encodeWord(java.lang.String s1)
          Encode a word as per a phone keypad.
static java.lang.String encodeWordMultitap(java.lang.String word)
          Return a string containing the Multitap code (i.e., keystrokes) for the given word.
 java.lang.String getCode()
          Returns a String representing the code for this EncodedWord.
 int getFreq()
          Returns an int representing the frequency of this EncodedWord.
static java.lang.String[] getMostFrequent(int n, EncodedWord[] dict)
          Returns a string array of the highest-frequency words in an EncodedWord dictionary.
static java.lang.String[] getQuery(java.lang.String code, EncodedWord[] dict)
          Returns a string array containing a query.
static java.lang.String[] getTentative(java.lang.String code, EncodedWord[] dict)
          Returns a String array (sorted by frequency) containing words tentatively matching the given code.
static java.lang.String[] getUnique(java.lang.String code, EncodedWord[] dict)
          Returns an array of words (sorted by frequency) matching the given code.
 java.lang.String getWord()
          Returns a String representing the word portion of this EncodedWord.
static EncodedWord[] loadCodedDictionary(java.lang.String fileName)
          Load a coded dictionary file into an EncodedWord array.
static void main(java.lang.String[] args)
          Test the EncodedWord class.
static void printEncodedWord(EncodedWord[] ew)
          Prints an array of EncodedWord objects.
static void printStringArray(java.lang.String[] s)
          Print a string array.
 void setCode(java.lang.String cde)
          Assigns a code to this EncodedWord.
 java.lang.String toString()
          Returns a String representation of this EncodedWord.
 
Methods inherited from class Word
compareTo, equals, incFreq, loadDictionary
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

EncodedWord

public EncodedWord(java.lang.String s,
                   int n)
Construct an EncodedWord object. If only the word and its frequency are passed as arguments, the code field is initialized as per a phone keypad. For example, the word "lazy" is assigned the code "5299".

Parameters:
s - a String representing a word.
n - an int representing the frequency of the word in the language.

EncodedWord

public EncodedWord(java.lang.String s,
                   int n,
                   java.lang.String cde)
Construct an EncodedWord object. If a code is also passed as an argument, it will be used for the code field of the EncodedWord. For example, for T9 encoding, the code "5299N" might be used for the word "lazy", or, for Multitap encoding, the code "55529999N999" might be used.

Parameters:
s - a String representing a word.
n - an int representing the frequency of the word in the language.
cde - a String representing the code for the word for the interaction technique of interest.

EncodedWord

public EncodedWord(java.lang.String s)
Construct an EncodedWord object. If only the word is passed, the frequency is initialized to zero, and the code is initialized as per the phone keypad. For example, "lazy" is initialized with frequency = 0 and code = "5299".

Parameters:
s - a String representing a word.
Method Detail

getWord

public java.lang.String getWord()
Returns a String representing the word portion of this EncodedWord.

Overrides:
getWord in class Word

getFreq

public int getFreq()
Returns an int representing the frequency of this EncodedWord.

Overrides:
getFreq in class Word

getCode

public java.lang.String getCode()
Returns a String representing the code for this EncodedWord.


setCode

public void setCode(java.lang.String cde)
Assigns a code to this EncodedWord.

Parameters:
cde - a String representing the code to assign to this EncodedWord. For example, the code "5299N" might be used to assign the standard T9 code to the word "lazy".

toString

public java.lang.String toString()
Returns a String representation of this EncodedWord.

Overrides:
toString in class Word

encodeWord

public static java.lang.String encodeWord(java.lang.String s1)
Encode a word as per a phone keypad. The argument is assumed to contain only letters. Any non-letter is encoded as '#', except a space, which is returned as a space.

Parameters:
s1 - a String representing a word (or phrase) to encode
Returns:
a String representing the word encoded as per a phone keypad. For example "lazy" is returned as "5299".

encodeLetter

public static char encodeLetter(char c)
Encode a character as per a phone keypad.

Parameters:
c - a char to encode
Returns:
a char for the numeric key on a phone keypad that carries the given character. For example, 'f' is returned as '3'.
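
The keypad mapping described for encodeWord and encodeLetter can be sketched as follows. This is a minimal illustration, not the class's actual source: the class name KeypadSketch is hypothetical, the letter groups are those of the standard telephone keypad, and treating upper-case letters like lower-case ones is an assumption. Returning '#' from the letter-level method for non-letters is also an assumption; the documentation only states that behavior for encodeWord.

```java
final class KeypadSketch {
    // Letters on keys 2..9 of a standard phone keypad; index i holds key (i + 2).
    private static final String[] KEYS = {
        "abc", "def", "ghi", "jkl", "mno", "pqrs", "tuv", "wxyz"
    };

    /** Returns the digit key carrying c, or '#' if c is not a letter a-z (assumption). */
    static char encodeLetter(char c) {
        char lower = Character.toLowerCase(c); // case-insensitivity is an assumption
        for (int i = 0; i < KEYS.length; i++) {
            if (KEYS[i].indexOf(lower) >= 0) {
                return (char) ('2' + i);
            }
        }
        return '#';
    }

    /** Encodes each character of s; a space is passed through unchanged. */
    static String encodeWord(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            sb.append(c == ' ' ? ' ' : encodeLetter(c));
        }
        return sb.toString();
    }
}
```

Under these assumptions, KeypadSketch.encodeWord("lazy") yields "5299" and KeypadSketch.encodeLetter('f') yields '3', in line with the examples given above.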

loadCodedDictionary

public static EncodedWord[] loadCodedDictionary(java.lang.String fileName)
                                         throws java.io.IOException
Load a coded dictionary file into an EncodedWord array. A coded dictionary file contains a series of lines, each containing three white-space delimited entries: a word, the frequency of the word in the target language, and the code (viz. keystrokes) used to enter the word using the intended interaction technique.

As an example of a dictionary file, here are the first few lines from the file d1-wordfreq-ks.txt:

     the   5776384   843
     of    2789403   63
     and   2421302   263
     a     1939617   2
     in    1695860   46
     to    1468146   86
 
After loading the dictionary, the entries are sorted "by code" in ascending order.

Parameters:
fileName - the name of the coded dictionary file
Returns:
an EncodedWord array.
Throws:
java.io.IOException
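
The file format and the "by code" sort described above can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: CodedDictionarySketch, Entry, and parse are hypothetical names, Entry stands in for EncodedWord, blank lines are skipped, and "ascending order" is taken to mean the natural (lexicographic) ordering of the code strings.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class CodedDictionarySketch {
    /** Hypothetical stand-in for EncodedWord: word, frequency, code. */
    static final class Entry {
        final String word;
        final int freq;
        final String code;

        Entry(String word, int freq, String code) {
            this.word = word;
            this.freq = freq;
            this.code = code;
        }
    }

    /** Parses "word frequency code" lines and sorts the entries by code. */
    static Entry[] parse(List<String> lines) {
        List<Entry> entries = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skipping blank lines is an assumption
            }
            String[] parts = trimmed.split("\\s+"); // white-space delimited fields
            if (parts.length != 3) {
                throw new IllegalArgumentException("Expected 3 fields: " + line);
            }
            entries.add(new Entry(parts[0], Integer.parseInt(parts[1]), parts[2]));
        }
        // Lexicographic String order on the code is an assumption.
        entries.sort(Comparator.comparing((Entry e) -> e.code));
        return entries.toArray(new Entry[0]);
    }
}
```

The lines themselves could be obtained with java.nio.file.Files.readAllLines, which throws java.io.IOException, consistent with the throws clause of loadCodedDictionary.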

getUnique

public static java.lang.String[] getUnique(java.lang.String code,
                                           EncodedWord[] dict)
Returns an array of words (sorted by frequency) matching the given code.

Parameters:
code - a String representing a word encoded as per the telephone keypad (e.g., "5299")
dict - an array of EncodedWord objects (i.e., a dictionary)
Returns:
a string array of words matching the code. The array is ordered with the most probable matches first. For example, if code = "5299", { "jazz", "lazy" } is returned. Returns a null reference if there are no matches.


getTentative

public static java.lang.String[] getTentative(java.lang.String code,
                                              EncodedWord[] dict)
Returns a String array (sorted by frequency) containing words tentatively matching the given code.

Parameters:
code - a String representing a word encoded as per the telephone keypad
dict - an array of EncodedWord objects (i.e., the dictionary)
Returns:
a string array of words tentatively matching the code. "Tentative" implies that the words in the returned array begin with a pattern of letters matching the given code. For example, if code = "5299", the returned array might be { "lawyers", "lawyer", "lazy", "jazz" }.
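
The difference between the exact matching of getUnique and the prefix matching of getTentative can be sketched as follows. This is an illustration only: CodeMatchSketch, Entry, exactMatches, and prefixMatches are hypothetical names, Entry stands in for EncodedWord, and returning null from the prefix variant when nothing matches is an assumption (the documentation states this only for getUnique).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class CodeMatchSketch {
    /** Hypothetical stand-in for EncodedWord: word, frequency, code. */
    static final class Entry {
        final String word;
        final int freq;
        final String code;

        Entry(String word, int freq, String code) {
            this.word = word;
            this.freq = freq;
            this.code = code;
        }
    }

    /** Words whose code equals the given code, most frequent first. */
    static String[] exactMatches(String code, Entry[] dict) {
        return collect(code, dict, true);
    }

    /** Words whose code begins with the given code, most frequent first. */
    static String[] prefixMatches(String code, Entry[] dict) {
        return collect(code, dict, false);
    }

    private static String[] collect(String code, Entry[] dict, boolean exact) {
        List<Entry> hits = new ArrayList<>();
        for (Entry e : dict) {
            boolean match = exact ? e.code.equals(code) : e.code.startsWith(code);
            if (match) {
                hits.add(e);
            }
        }
        if (hits.isEmpty()) {
            return null; // null on no match (assumed for the prefix case)
        }
        // Descending frequency, so the most probable word comes first.
        hits.sort(Comparator.comparingInt((Entry e) -> e.freq).reversed());
        String[] words = new String[hits.size()];
        for (int i = 0; i < words.length; i++) {
            words[i] = hits.get(i).word;
        }
        return words;
    }
}
```

The sketch scans the whole array for simplicity. Because the dictionary is sorted by code after loading, a real implementation might instead use a binary search to locate the matching range; whether EncodedWord does so is not stated here.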

getMostFrequent

public static java.lang.String[] getMostFrequent(int n,
                                                 EncodedWord[] dict)
Returns a string array of the highest-frequency words in an EncodedWord dictionary. The returned string array is sorted in descending order with the highest frequency word first.

Parameters:
n - the number of words to return
dict - an array of EncodedWord objects
Returns:
a String array (e.g., { "the", "of", "and" } )
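
One way to select the n highest-frequency words is sketched below. The names MostFrequentSketch, Entry, and mostFrequent are hypothetical, Entry stands in for EncodedWord, and clamping n to the range 0..dict.length is an assumption about how out-of-range values are treated.

```java
import java.util.Arrays;
import java.util.Comparator;

final class MostFrequentSketch {
    /** Hypothetical stand-in for EncodedWord: only word and frequency are needed. */
    static final class Entry {
        final String word;
        final int freq;

        Entry(String word, int freq) {
            this.word = word;
            this.freq = freq;
        }
    }

    /** Returns up to n words, highest frequency first, without modifying dict. */
    static String[] mostFrequent(int n, Entry[] dict) {
        Entry[] sorted = dict.clone(); // sort a copy so the caller's order is kept
        Arrays.sort(sorted, Comparator.comparingInt((Entry e) -> e.freq).reversed());
        int count = Math.max(0, Math.min(n, sorted.length)); // clamping is an assumption
        String[] words = new String[count];
        for (int i = 0; i < count; i++) {
            words[i] = sorted[i].word;
        }
        return words;
    }
}
```

Sorting a copy matters here because the loaded dictionary is kept in code order, which other lookups may rely on.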

getQuery

public static java.lang.String[] getQuery(java.lang.String code,
                                          EncodedWord[] dict)
Returns a string array containing a query. Returns null if the query has no entries. A query consists of an array of words beginning with those returned by getUnique(), followed by those returned by getTentative(). Duplicates are removed.

Parameters:
code - a String containing the numeric code for a word (e.g., "5299")
dict - an array of EncodedWord objects
Returns:
a String array (e.g., { "jazz", "lazy" } )
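
The merge step described above (unique matches first, then tentative matches, duplicates removed) can be sketched as follows. QueryMergeSketch and merge are hypothetical names; the two arguments stand for the arrays returned by getUnique() and getTentative(), either of which is treated as empty when null. Keeping the first occurrence of a duplicate is an assumption.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class QueryMergeSketch {
    /** Concatenates unique then tentative, dropping later duplicates; null if empty. */
    static String[] merge(String[] unique, String[] tentative) {
        // LinkedHashSet keeps insertion order and ignores repeated words.
        Set<String> merged = new LinkedHashSet<>();
        addAll(merged, unique);
        addAll(merged, tentative);
        return merged.isEmpty() ? null : merged.toArray(new String[0]);
    }

    private static void addAll(Set<String> target, String[] words) {
        if (words == null) {
            return; // a null array means "no matches"
        }
        for (String w : words) {
            target.add(w);
        }
    }
}
```

Because exact matches also satisfy the prefix test, the tentative list normally repeats the unique words, which is why the duplicate removal is needed.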

printEncodedWord

public static void printEncodedWord(EncodedWord[] ew)
Prints an array of EncodedWord objects.

Parameters:
ew - an array of EncodedWord objects

printStringArray

public static void printStringArray(java.lang.String[] s)
Print a string array.


encodeWordMultitap

public static java.lang.String encodeWordMultitap(java.lang.String word)
Return a string containing the Multitap code (i.e., keystrokes) for the given word.

Parameters:
word - a word to encode as per Multitap keystrokes.
Returns:
a string containing the Multitap keystrokes required to enter the given word.
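
A Multitap encoding consistent with the "lazy" example given for the three-argument constructor ("55529999N999") can be sketched as follows. MultitapSketch and encode are hypothetical names. The sketch assumes that each letter is entered by pressing its key once per position on that key, and that 'N' separates consecutive letters on the same key, as the example suggests. Rejecting characters other than a-z and ignoring case are assumptions.

```java
final class MultitapSketch {
    // Letters on keys 2..9 of a standard phone keypad; index i holds key (i + 2).
    private static final String[] KEYS = {
        "abc", "def", "ghi", "jkl", "mno", "pqrs", "tuv", "wxyz"
    };

    /** Returns the Multitap keystrokes for word under the stated assumptions. */
    static String encode(String word) {
        StringBuilder sb = new StringBuilder();
        int previousKey = -1; // index of the key used for the previous letter
        for (int i = 0; i < word.length(); i++) {
            char c = Character.toLowerCase(word.charAt(i));
            int key = -1;
            int presses = 0;
            for (int k = 0; k < KEYS.length; k++) {
                int pos = KEYS[k].indexOf(c);
                if (pos >= 0) {
                    key = k;
                    presses = pos + 1; // first letter on a key needs one press
                    break;
                }
            }
            if (key < 0) {
                throw new IllegalArgumentException("Not a letter a-z: " + word.charAt(i));
            }
            if (key == previousKey) {
                sb.append('N'); // separator between letters on the same key
            }
            char digit = (char) ('2' + key);
            for (int p = 0; p < presses; p++) {
                sb.append(digit);
            }
            previousKey = key;
        }
        return sb.toString();
    }
}
```

Under these assumptions, MultitapSketch.encode("lazy") produces "55529999N999": 'l' is the third letter on key 5, 'a' the first on key 2, 'z' the fourth on key 9, and 'y' the third on key 9, preceded by 'N' because it shares a key with 'z'.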

main

public static void main(java.lang.String[] args)
                 throws java.io.IOException
Test the EncodedWord class.

Throws:
java.io.IOException