MacKenzie, I. S. (2002). Introduction to this special issue on text entry for mobile computing. Human-Computer Interaction, 17, 141-145.
Introduction to This Special Issue
on Text Entry for Mobile Computing
I. Scott MacKenzie
Department of Computer Science
York University
The four articles in this special issue of Human-Computer Interaction were submitted in response to a call for papers describing recent research in mobile text entry. Text entry research is by no means new. The first wave, so to speak, was in the 1970s and early 1980s in response to the new role of electronic computers in automating office tasks such as typing, word processing, and document management. Modeling ten-finger touch typing, categorizing typing errors, task analysis, and comparing document editing strategies are some of the themes in this early research. Exemplary references are the edited book by Cooper (1983) and chapters 3-9 in The psychology of human-computer interaction (Card, Moran, & Newell, 1983).
The second wave began with the arrival of pen-based computing in the early 1990s. The automated recognition of handwritten text entered with a stylus was the elixir for this new mode of interaction, it seemed. Although the potential of pen-based computing, and mobility in general, was well beyond the recognition of hand-printed symbols, this aspect of the interaction received much of the attention. And the attention was not good. Promises did not meet the expectations of demanding users, and the market suffered significantly for this. Yet products continued to arrive. The single most significant event in pen-based computing was the introduction in 1995, and the tremendous success, of the Palm Pilot from Palm Computing, Inc. (now a division of 3Com). For text entry, Palm side-stepped many of the problems with existing handwriting recognition technology. To avoid the segmentation problems inherent in multi-stroke hand-printed or cursive text, Palm introduced Graffiti, a commercial realization of the one-stroke-per-symbol technique known as unistrokes (Goldberg & Richardson, 1993). For users wanting to avoid handwritten text entry altogether, the Palm Pilot and other pen-entry devices also included a soft keyboard allowing users to tap on virtual keys, thus directly obtaining the desired symbol.
Besides research efforts in handwriting recognition and soft keyboards, another phenomenon has recently grabbed the attention of users and researchers: text messaging on mobile phones. The ability to discreetly, asynchronously, and at very low cost, send a message from one phone to another has proven hugely successful, particularly in Europe. The statistics are remarkable: more than a billion text messages per month! This is particularly surprising in view of the limited capability of the cell phone keypad for text input. Not surprisingly then, numerous researchers and companies are grappling with the problem of improving the text entry technique for mobile phones or other anticipated mobile products supporting similar services.
A review article by Soukoreff and MacKenzie opens this special issue. The article begins with general comments on methodologies for evaluating new text entry techniques, including an elaboration on the differing attention demands between text creation tasks and text copy tasks. Other aspects of evaluation include the need to consider both the novice experience and the expert potential of a proposed technique. Some of the more recent and exciting research in mobile text entry involves the combined use of movement minimization techniques and language modeling in designing and optimizing the text entry task. Many factors in this area are noted and elaborated with particular attention to the ambiguity in the telephone keypad where each key encodes either three or four letters. The article then surveys text entry techniques for mobile computing, including numerous key-based and stylus-based techniques.
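The keypad ambiguity mentioned above can be illustrated with a minimal sketch. It assumes the standard letter groups on keys 2 through 9; the function names and the tiny word list are hypothetical and chosen only for illustration. Words that map to the same key sequence cannot be told apart from one key press per letter alone.

```python
from collections import defaultdict

# Standard letter groups on telephone keys 2-9 (three or four letters per key).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {ch: key for key, letters in KEYPAD.items() for ch in letters}


def key_sequence(word):
    """Return the key sequence produced by pressing one key per letter."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())


def collisions(words):
    """Group words sharing a key sequence; keep only the ambiguous groups."""
    groups = defaultdict(list)
    for word in words:
        groups[key_sequence(word)].append(word)
    return {seq: ws for seq, ws in groups.items() if len(ws) > 1}


if __name__ == "__main__":
    # Hypothetical word list, for illustration only.
    print(collisions(["home", "good", "the"]))
```

A language model can rank the candidates within each ambiguous group, which is one way the language modeling noted above enters the design of keypad-based techniques.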
Emerging from the language theory seeded in Shannon's early work (Shannon, 1951; Shannon & Weaver, 1949), Ward, Blackwell, and MacKay devised a dynamic interface for text entry. In the second article in this special issue, they present Dasher, a text entry technique with an interface driven from continuous two-dimensional gestures that regulate the flow of textual information across the display. The textual information is organized to fully exploit language redundancy, thus affording the means to accelerate the text entry task. For example, having just entered t, a set of follow-on letters progresses across the screen with more display space given to more likely letters, thus facilitating selection of, for example, h, and, subsequently, e, in entering the. Of course, the details are everything, and, in this, Ward and colleagues cover their territory well. They describe mechanisms for correcting errors, and for driving the interface with devices such as a stylus or eye tracker. Refreshing, as well, is their detailed extension of their interface using the Hiragana character set used in Japanese text entry.
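The idea of giving more display space to more likely letters can be sketched as follows. This is not Dasher's implementation: the function name is hypothetical, and the follow-on probabilities after t are invented placeholder values rather than output from any language model. The sketch only shows how a probability distribution might be turned into stacked screen intervals.

```python
def allocate_intervals(probabilities, height=1.0):
    """Split a display span of the given height among symbols,
    each receiving a share proportional to its probability."""
    total = sum(probabilities.values())
    intervals = {}
    top = 0.0
    for symbol, p in sorted(probabilities.items()):  # alphabetical stacking
        size = height * p / total
        intervals[symbol] = (top, top + size)
        top += size
    return intervals


if __name__ == "__main__":
    # Invented probabilities for the letter following "t" (assumption).
    after_t = {"h": 0.5, "o": 0.25, "e": 0.25}
    for symbol, (start, end) in allocate_intervals(after_t).items():
        print(symbol, start, end)
```

Under these placeholder values, h occupies half of the span, so it presents the largest target and is the easiest follow-on letter to steer toward.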
In the third article, Zhai, Hunter, and Smith provide a detailed extension to the prediction model of Soukoreff and MacKenzie (Soukoreff & MacKenzie, 1995). The original motivation was MacKenzie and Zhang's description and evaluation of a high performance soft keyboard designed using this model (MacKenzie & Zhang, 1999). MacKenzie and Zhang designed their keyboard by trial and error. Specifically, the model was embedded in a spreadsheet that included digram probabilities from a language corpus and coefficients for a movement time prediction model. A separate collection of cells contained characters representing a soft keyboard. Formulae in the spreadsheet collected together the various components of the model and produced an estimate of the expert text entry rate for the given soft keyboard. By directly editing the soft keyboard cells, different layouts were tested with a prediction appearing immediately with each edit. MacKenzie and Zhang worked the spreadsheet using simple heuristics to fine tune the layout (e.g., common letters should be clustered together near the centre; push infrequent letters to the edges). Predictions edged upward, and eventually a design was settled on - Opti - and tested in a user study. Zhai and colleagues immediately spotted an opportunity in this methodology. Instead of manually rearranging letters, why not use an algorithm to automatically explore the design space? In their article, they present two quantitative techniques to search for optimized virtual keyboard layouts. The first technique simulates the dynamics of a keyboard with "digraph springs", producing a Hooke keyboard. The second yields a Metropolis keyboard using a random walk algorithm guided by a "Fitts-digraph energy" objective function. The details of these design techniques represent a welcome addition to research in mobile text entry.
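The kind of prediction and search described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the procedure from either article: the Fitts' law coefficients `a` and `b`, the key width, the temperature, and the toy layout and digraph probabilities in the usage example are all placeholders, and the random walk here swaps letters between fixed key positions. The energy is the expected movement time per character, that is, the sum over digraphs of probability times Fitts' law movement time.

```python
import math
import random


def fitts_time(distance, width=1.0, a=0.0, b=0.2):
    """Fitts' law movement time in seconds; a and b are placeholder values."""
    return a + b * math.log2(distance / width + 1)


def mean_time(layout, digraph_probs):
    """Fitts-digraph energy: sum over digraphs of P(i, j) * MT(i, j)."""
    total = 0.0
    for (first, second), p in digraph_probs.items():
        (x1, y1), (x2, y2) = layout[first], layout[second]
        total += p * fitts_time(math.hypot(x2 - x1, y2 - y1))
    return total


def words_per_minute(layout, digraph_probs):
    """Convert seconds per character to words per minute (5 chars per word)."""
    return 60.0 / (5.0 * mean_time(layout, digraph_probs))


def metropolis_search(layout, digraph_probs, steps=2000, temperature=0.05, seed=0):
    """Random walk over layouts: propose a swap of two letters, always accept
    improvements, and accept worse layouts with probability exp(-delta / T)."""
    rng = random.Random(seed)
    current = dict(layout)
    energy = mean_time(current, digraph_probs)
    best, best_energy = dict(current), energy
    keys = sorted(current)
    for _ in range(steps):
        k1, k2 = rng.sample(keys, 2)
        current[k1], current[k2] = current[k2], current[k1]
        new_energy = mean_time(current, digraph_probs)
        delta = new_energy - energy
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            energy = new_energy
            if energy < best_energy:
                best, best_energy = dict(current), energy
        else:
            # Rejected: undo the swap.
            current[k1], current[k2] = current[k2], current[k1]
    return best, best_energy


if __name__ == "__main__":
    # Toy four-key row and invented digraph probabilities (assumptions).
    layout = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (3, 0)}
    probs = {("a", "d"): 0.6, ("b", "c"): 0.4}
    print(words_per_minute(layout, probs))
    print(metropolis_search(layout, probs))
```

Occasionally accepting a worse layout is what lets the walk escape local minima that a purely greedy rearrangement, manual or automated, could get stuck in.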
In the final article, Hughes, Warren, and Buyukkokten take a completely different approach to modeling the text entry task. By conventional practice, a model is built under a set of conditions that are limited in number, yet are representative of the broader range of conditions under which the model is later applied. Fitts' law is an example of this approach (Fitts, 1954). Typically, the model is built using a representative set of conditions, say, eight different combinations of target width and amplitude, and then is later applied under conditions never actually used in building the model. Such is the generality of Fitts' law that the subsequent predictions are surprisingly accurate. Hughes and colleagues instead built an empirical bi-action table containing an entry for each and every condition. Each entry is an empirical measurement of the time for users to make an action, given a preceding action. Despite the apparent brute-force approach, there is indeed a modeling component to their efforts. During the data-collection stage, the actions bear no linguistic assignment. The actions are simple motor acts, performed repeatedly at peak rates for short durations. Armed with a table of minimum movement times for the action set, they proceeded with the subsequent task of assigning letters to keys and searching for the optimal assignment. Thus, they too arrive at an optimal design but the journey is along a very different path.
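How a bi-action table might be used to score and search letter assignments can be sketched as below. All timings, action names, and digraph probabilities in the usage example are invented placeholders, not measurements, and exhaustive enumeration stands in for whatever search procedure the article actually uses; it is viable only for very small action sets.

```python
from itertools import permutations


def expected_time(assignment, bi_action_times, digraph_probs):
    """Mean time per character: each digraph's probability times the table
    entry for the pair of actions its two letters are assigned to."""
    return sum(
        p * bi_action_times[(assignment[first], assignment[second])]
        for (first, second), p in digraph_probs.items()
    )


def best_assignment(letters, actions, bi_action_times, digraph_probs):
    """Try every one-to-one assignment of letters to actions."""
    best, best_time = None, float("inf")
    for perm in permutations(actions, len(letters)):
        assignment = dict(zip(letters, perm))
        t = expected_time(assignment, bi_action_times, digraph_probs)
        if t < best_time:
            best, best_time = assignment, t
    return best, best_time


if __name__ == "__main__":
    actions = ["A", "B", "C"]
    # Invented table: seconds to perform `second` given preceding `first`.
    times = {(p, q): 0.3 for p in actions for q in actions}
    times[("A", "B")] = 0.1
    times[("B", "C")] = 0.2
    # Invented digraph probabilities over a three-letter alphabet.
    probs = {("x", "y"): 0.7, ("y", "z"): 0.3}
    print(best_assignment(["x", "y", "z"], actions, times, probs))
```

Because the table stores one entry per action pair, no movement model is needed at this stage; the linguistic information enters only through the digraph probabilities.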
As a collection, these articles represent a sample of promising research initiatives in text entry for mobile computing. We hope you find the work both stimulating and suggestive of further avenues to explore in this exciting research domain.
NOTES
Acknowledgments. We thank the reviewers who participated in the process of bringing this collection of papers to print, as well as Tom Moran, Editor of Human-Computer Interaction, for suggesting the theme of this special issue.
REFERENCES
Card, S. K., Moran, T. P., & Newell, A. (1983). The psychology of human-computer interaction. Hillsdale, NJ: Lawrence Erlbaum.
Cooper, W. E. (1983). Cognitive aspects of skilled typewriting. New York: Springer-Verlag.
Fitts, P. M. (1954). The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47, 381-391.
Goldberg, D., & Richardson, C. (1993). Touch-typing with a stylus. Proceedings of the ACM Conference on Human Factors in Computing Systems - INTERCHI '93, pp. 80-87. New York: ACM.
MacKenzie, I. S., & Zhang, S. X. (1999). The design and evaluation of a high-performance soft keyboard. Proceedings of the ACM Conference on Human Factors in Computing Systems - CHI '99, pp. 25-31. New York: ACM.
Shannon, C. E. (1951). Prediction and entropy of printed English. Bell System Technical Journal, 30, 51-64.
Shannon, C. E., & Weaver, W. (1949). The mathematical theory of communication. Urbana, IL: University of Illinois Press.
Soukoreff, R. W., & MacKenzie, I. S. (1995). Theoretical upper and lower bounds on typing speeds using a stylus and soft keyboard. Behaviour & Information Technology, 14, 370-379.
ARTICLES IN THIS SPECIAL ISSUE
Hughes, D., Warren, J., & Buyukkokten, O. (2002). Empirical bi-action tables: A tool for the evaluation and optimization of text input systems. Application I: Stylus keyboards. Human-Computer Interaction, 17, 271-309.
MacKenzie, I. S., & Soukoreff, R. W. (2002). Text entry for mobile computing: Models and methods, theory and practice. Human-Computer Interaction, 17, 147-198.
Ward, D. J., Blackwell, A. F., & MacKay, D. J. C. (2002). Dasher: A gesture-driven data entry interface for mobile computing. Human-Computer Interaction, 17, 199-228.
Zhai, S., Hunter, M., & Smith, B. A. (2002). Performance optimization of virtual keyboards. Human-Computer Interaction, 17, 229-269.