Class WordPredict

java.lang.Object
  extended by WordPredict

public class WordPredict
extends java.lang.Object

Class and demo program for word prediction and word completion.

Invocation:

     PROMPT>java WordPredict file n

     where file = a word+freq dictionary file
           n = maximum number of predicted words
 
Here is an example dialogue. The lines holding a single word stem (t, th, the, psy, ja) and the final ^z are user input:
     PROMPT>java WordPredict d1-wordfreq.txt 10
     Dictionary contains 64566 words
     Most frequent 10 words...
     the of and to a in that it is was
     Enter word or word stem...
     t
     the to that this they their there them time two
     th
     the that this they their there them then these than
     the
     the they their there them then these themselves therefore theory
     psy
     psychological psychology psychiatric psychologists
     ja
     january james japan jack japanese jane jacket jackson jan jazz
     ^z
 

Author:
Scott MacKenzie, 2002-2010

Constructor Summary
WordPredict(java.lang.String wordFreqFile)
          Construct a WordPredict object.
 
Method Summary
 int getSizeOfDictionary()
          Return the size of the dictionary.
 java.lang.String[] getWords(java.lang.String stem, int max)
          Get an array of words completing the specified word stem.
static void main(java.lang.String[] args)
          Run the demo program (see Invocation in the class description).
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

WordPredict

public WordPredict(java.lang.String wordFreqFile)
            throws java.io.IOException
Construct a WordPredict object.

A WordPredict object is used for word completion (given a word stem) or word prediction (given an empty word stem).

Parameters:
wordFreqFile - a word-frequency dictionary file, for example, d1-wordfreq.txt.
Throws:
java.io.IOException
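
The layout of the word-frequency file is not specified here. As a rough sketch only, assuming one entry per line in the form "word frequency" separated by whitespace, loading such a file might look like the following. The class WordFreqReaderSketch and its read method are hypothetical and are not part of WordPredict.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the WordPredict API.
class WordFreqReaderSketch {

    // ASSUMPTION: each line is "<word> <frequency>", whitespace-separated.
    // The real format of d1-wordfreq.txt may differ.
    static Map<String, Integer> read(Reader source) throws IOException {
        Map<String, Integer> freq = new LinkedHashMap<>(); // keeps file order
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                String[] parts = line.trim().split("\\s+");
                if (parts.length < 2) {
                    continue; // skip blank or incomplete lines
                }
                // Throws NumberFormatException if the count is not an integer.
                freq.put(parts[0], Integer.parseInt(parts[1]));
            }
        }
        return freq;
    }
}
```

Passing a java.io.FileReader for the dictionary file would read from disk; an I/O failure surfaces as java.io.IOException, matching the constructor's declared exception.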
Method Detail

main

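public static void main(java.lang.String[] args)
                 throws java.io.IOException
Run the demo program. The command-line arguments are the word-frequency dictionary file and the maximum number of predicted words (see Invocation in the class description).
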
Throws:
java.io.IOException
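
Judging from the example dialogue in the class description, the demo reads one stem per line until end of input and prints the matching words on a single line. The sketch below reproduces only that loop. DemoLoopSketch, its run method, and the lookup parameter are hypothetical names; printing an empty line when nothing matches is an assumption, since the dialogue shows no such case.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintStream;
import java.util.function.BiFunction;

// Hypothetical sketch of the interactive loop; not the actual main method.
class DemoLoopSketch {

    // lookup stands in for WordPredict.getWords(stem, max).
    static void run(BufferedReader in, PrintStream out,
                    BiFunction<String, Integer, String[]> lookup, int max)
            throws IOException {
        out.println("Enter word or word stem...");
        String stem;
        // readLine returns null at end of input (for example ^z or ^d).
        while ((stem = in.readLine()) != null) {
            String[] words = lookup.apply(stem.trim(), max);
            // ASSUMPTION: print an empty line when there are no matches.
            out.println(words == null ? "" : String.join(" ", words));
        }
    }
}
```

With a real WordPredict instance wp, the lookup argument could be the method reference wp::getWords, and the reader could wrap System.in.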

getWords

public java.lang.String[] getWords(java.lang.String stem,
                                   int max)
Get an array of words completing the specified word stem. The array contains either predicted words, if the word stem is an empty string, or completed words, if the word stem contains at least one letter. In either case, the list is sorted by decreasing word frequency.

This method provides only a "discount" (that is, simplistic) form of word prediction: if the word stem is an empty string, the list is simply the max most frequent words in the dictionary.

Note that the array may contain fewer than max words, depending on the word stem and the dictionary.

Parameters:
stem - a word stem
max - maximum number of words to return
Returns:
a string array containing a list of words based on the word stem (see above), or null if there are no matches or if stem is null
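
The contract described above can be illustrated with a small in-memory stand-in. This is a sketch under stated assumptions, not the actual implementation: GetWordsSketch is a hypothetical class, it takes a list already ordered from most to least frequent, and it uses a plain linear scan with String.startsWith.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that mimics the documented getWords contract.
class GetWordsSketch {

    // ASSUMPTION: words are supplied most frequent first.
    private final List<String> wordsByFreq;

    GetWordsSketch(List<String> wordsByFreq) {
        this.wordsByFreq = new ArrayList<>(wordsByFreq);
    }

    // Returns up to max words starting with stem, most frequent first,
    // or null if stem is null or nothing matches.
    String[] getWords(String stem, int max) {
        if (stem == null) {
            return null;
        }
        List<String> matches = new ArrayList<>();
        for (String word : wordsByFreq) {
            if (matches.size() >= max) {
                break;
            }
            // Every word starts with "", so an empty stem yields the
            // max most frequent words (the prediction case).
            if (word.startsWith(stem)) {
                matches.add(word);
            }
        }
        return matches.isEmpty() ? null : matches.toArray(new String[0]);
    }
}
```

Because the method can return null, callers should check the result before reading its length or iterating over it.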

getSizeOfDictionary

public int getSizeOfDictionary()
Return the size of the dictionary.

Returns:
the size of the dictionary