MacKenzie, I. S., & Buxton, W. (1993). A tool for the rapid evaluation of input devices using Fitts' law models. SIGCHI Bulletin, 25(3), 58-63. [software]
A Tool for the Rapid Evaluation of Input Devices Using Fitts' Law Models
I. Scott MacKenzie¹ and William Buxton²
¹Dept. of Computing & Information Science
University of Guelph
Guelph, Ontario, Canada, N1G 2W1
²Computer Systems Research Institute
University of Toronto
Toronto, Ontario, Canada M5S 1A4
Abstract
A tool for building Fitts' law models is described. MODEL BUILDER runs on the Apple Macintosh using any device which connects to the Apple Desktop Bus. After 16 blocks of trials taking about 4-5 minutes, the program provides an immediate (albeit tentative) statistical analysis, showing the coefficients in the prediction equation, the coefficient of correlation, and a regression line with scatter points. MODEL BUILDER can be retrieved anonymously by researchers, educators, developers, or anyone with access to INTERNET through file-transfer-protocol (ftp).
INTRODUCTION
Evaluating the performance of different computer input devices on routine tasks has inspired a substantial body of research in HCI. (See Greenstein & Arnaut, 1988; Milner, 1988; or Thomas & Milan, 1987 for reviews.) One focus is in developing predictive models useful outside the experimental regime. Such models have the potential to predict the performance of device-task combinations before products are finalized, or to aid in exploring what-if scenarios for human-computer interfaces.
In this paper, we will describe MODEL BUILDER, a program that builds Fitts' law prediction models for any input device that can connect to the Apple Macintosh. One objective herein is, in a simplistic sense, to "publish" MODEL BUILDER - to make the program available to anyone with electronic mail access to INTERNET. The program resides on a file server at the University of Guelph and may be retrieved using file-transfer-protocol (ftp).
There is no shortage of utility programs in the public domain, but tools specifically aimed at research are generally not offered. There are three objectives in publishing MODEL BUILDER in this way. First, it seems an appropriate (but neglected) use of technology to help others duplicate or test our results. Second, it permits others to extend our research to other devices and conditions in such a way that results are more easily compared. Third, we are interested in contributing to HCI education by making the tool available to students and instructors.
We call MODEL BUILDER a "rapid" evaluation tool since it immediately performs a statistical analysis on performance data and provides a linear regression prediction equation, coefficient of correlation, and a plot of the regression line with scatter points. The number of trials upon which the evaluation is based defaults to 160 but is user-selectable through a setup screen. Data files are created and saved permitting in-depth, follow-up analyses across multiple devices, trial blocks, subjects, or any other experimental condition of interest.
INTRODUCTION TO FITTS' LAW
The emerging prevalence of direct manipulation interfaces has resulted in a paradigm shift for modelling user performance. For primitive tasks at least, keystroke models are of diminishing relevance. "Movement" models, on the other hand, may be closer to the underlying processes which determine and limit performance. As a cross-disciplinary study, HCI can now add to its constituent fields kinematics, psychomotor behaviour, kinaesthetics, and other disciplines previously remote from research directed toward optimizing human-computer interfaces.
One of the most robust and widely adopted models of human movement is Fitts' law (Fitts, 1954). Fitts' information processing model has been used widely in previous HCI research and holds considerable promise as a tool for design (Card, Mackinlay, & Robertson, 1990; MacKenzie, in press, 1992; Marchionini & Sibert, 1991; Newell & Card, 1985). Typical examples of the law in HCI research include Boritz, Booth, and Cowan (1991); Card, English, and Burr (1978); Gillan, Holden, Adam, Rudisill, and Magee (1990); MacKenzie, Sellen, and Buxton (1991); Walker and Smelcer (1990); and Ware and Mikaelian (1987).
The following paragraphs briefly summarize Fitts' law. For detailed reviews, see Buxton (in press), MacKenzie (1992), or Meyer, Smith, Kornblum, Abrams, and Wright (1990).
According to Fitts' law, the time (MT) to move to and select a target of width W which lies at distance (or amplitude) A is
$$MT = a + b \log_2(2A / W) \tag{1}$$
where a and b are constants determined through linear regression. W corresponds to "accuracy" -- the required region where an action terminates. The log term is the index of difficulty (ID) and carries the unit "bits" (because the base is 2). If MT is measured in "seconds", then the unit for a is "seconds" and for b, "seconds/bit". The reciprocal of b is the index of performance (IP) in "bits/second". This is the human rate of information processing for the movement task under investigation.
Variations of the law have been proposed by Welford (1968),
$$MT = a + b \log_2(A / W + 0.5), \tag{2}$$
and MacKenzie (1989),
$$MT = a + b \log_2(A / W + 1). \tag{3}$$
Equations 1, 2, and 3 differ only in the formulations for ID. On the whole, Equation 3, known as the Shannon formulation, is preferred because it
- provides a slightly better fit with observations,
- exactly mimics the information theorem underlying Fitts' law, and
- always gives a positive rating for the index of task difficulty.
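The last point can be checked numerically. The following Python sketch evaluates the three formulations of ID for a few amplitude-to-width pairs, including one where the target is much wider than the movement distance. The function names and the sample pairs are ours, chosen for illustration; they are not part of MODEL BUILDER.

```python
import math

def id_fitts(a, w):
    """Equation 1: ID = log2(2A / W)."""
    return math.log2(2 * a / w)

def id_welford(a, w):
    """Equation 2: ID = log2(A / W + 0.5)."""
    return math.log2(a / w + 0.5)

def id_shannon(a, w):
    """Equation 3: ID = log2(A / W + 1)."""
    return math.log2(a / w + 1)

# (A, W) pairs in pixels, chosen for illustration only
for a, w in [(512, 8), (64, 64), (16, 64)]:
    print(f"A={a:3d} W={w:2d}  Fitts={id_fitts(a, w):6.2f}  "
          f"Welford={id_welford(a, w):6.2f}  Shannon={id_shannon(a, w):6.2f}")
```

For A = 16 and W = 64 the Fitts and Welford formulations give negative ratings, while the Shannon formulation remains positive for any A greater than zero.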
Since IP is in bits/s, it is sometimes called bandwidth. Intuitively, the higher the bandwidth the higher the rate of human performance since more information is being articulated per unit time. One of the strengths in Fitts' law is that measures for IP, or bandwidth, can motivate performance comparisons across factors such as device, limb, or task. It follows that performance in a human-computer interface can be optimized by selecting and combining those conditions yielding high bandwidths.
Unfortunately, substantial theoretical and methodological problems exist in applying Fitts' law, with the result that the potential to actually use prediction models (or metrics such as bandwidth), is seriously compromised (MacKenzie, 1992). An example is error rate. Although a technique exists for normalizing responses to accommodate the speed-accuracy tradeoff, it is rarely applied. The technique, first described by Crossman in 1960 (Welford, 1968, p. 147), calls for target width (W) to be transformed into an effective target width (We) reflecting the spatial variability in subjects' actions. All Fitts' law models built in such a manner carry an inherent, nominal error rate of 4%. Performance metrics such as bandwidth are more accurate and more useful if they encompass both the speed and accuracy of responses. Furthermore, models derived in such a manner can be compared with confidence that the differences found are due to inherent properties in devices, tasks, etc., rather than to experimental procedures or hidden factors that may have induced behaviour at different points on the speed-accuracy continuum.
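The transformation itself is not spelled out in this paper. A minimal sketch follows, assuming the commonly used form We = 4.133 × SD described in MacKenzie (1992), where SD is the standard deviation of the selection coordinates in a block of trials. A span of 4.133 standard deviations covers roughly 96% of a normal distribution, which is where the nominal 4% error rate comes from. The function names and the sample coordinates are hypothetical.

```python
import math
import statistics

def effective_width(x_coords):
    """Effective target width We = 4.133 * SD of the selection coordinates.

    Assumption: the constant 4.133 follows MacKenzie (1992); it is not
    stated in this paper. The sample standard deviation is used here.
    """
    return 4.133 * statistics.stdev(x_coords)

def id_shannon_effective(a, x_coords):
    """Shannon index of difficulty using We in place of W."""
    return math.log2(a / effective_width(x_coords) + 1)

# Hypothetical selection coordinates (pixels, relative to target centre)
xs = [-2.0, 1.0, 0.0, 3.0, -1.0, 2.0, -3.0, 0.0]
print(round(effective_width(xs), 2))
print(round(id_shannon_effective(64, xs), 2))
```

Whether to use the sample or the population standard deviation is a design choice; with ten or more trials per block the difference is small.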
MODEL BUILDER
Apparatus
MODEL BUILDER is implemented on an Apple Macintosh. A Fitts' law model can be built for any device that can connect to the Apple Desktop Bus (ADB) port. Devices that use RS232 or other interfaces can also be tested, provided they are accompanied by custom drivers enabled through Apple's Control Panel. We have tested numerous devices connected to the ADB port and a Wacom tablet (Model SD42X) which connects to the RS232 modem port (MacKenzie et al., 1991).
Task Paradigm and Design
The task is an implementation of Fitts' reciprocal tapping task using a computer input device and a CRT display. A typical screen is shown in Figure 1. At the beginning of a block, two rectangular targets appear along with a cross-hair cursor which is maneuvered by manipulating the input device. A thick arrow appears below one of the targets to improve the stimulus-response (SR) compatibility of the task and to synchronize subjects with the software. (This becomes important if gross errors occur, such as selecting twice on the same target.) The arrow points to the target to be selected next. As selections are made, the arrow moves from target to target, guiding the subject through the block of trials.
Figure 1. A typical screen showing Fitts' reciprocal tapping task (point-and-select) as implemented for a CRT display and any input device.
The design mimics that employed by Fitts (1954) in his original experiments using a stylus. The distance between the targets (the amplitude, A) and the width of the targets (W) each vary over four levels with
A = {64, 128, 256, 512} pixels, and W = {8, 16, 32, 64} pixels.
The easiest condition has A = 64 and W = 64 for a task difficulty of
$$ID = \log_2(64 / 64 + 1) = 1.00 \text{ bit}.$$
The hardest condition has A = 512 and W = 8 for a task difficulty of
$$ID = \log_2(512 / 8 + 1) = 6.02 \text{ bits}.$$
The sixteen A-W conditions are presented in random order with a block of trials performed at each condition. The default is 10 trials per block. At the end of each block, the screen goes blank for about 1 second and then the next condition appears. A selection outside the target is considered an error and is accompanied by a beep.
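The design can be sketched in a few lines of Python. This is an illustration of the four-by-four layout and random ordering described above, not the code used in MODEL BUILDER; the names are ours, and the amplitude levels are assumed to be successive powers of two.

```python
import itertools
import math
import random

AMPLITUDES = [64, 128, 256, 512]  # pixels (assumed powers of two)
WIDTHS = [8, 16, 32, 64]          # pixels

def make_conditions(seed=None):
    """Return the 16 A-W conditions in random order."""
    conditions = list(itertools.product(AMPLITUDES, WIDTHS))
    random.Random(seed).shuffle(conditions)
    return conditions

def id_shannon(a, w):
    """Shannon formulation of the index of difficulty, in bits."""
    return math.log2(a / w + 1)

conditions = make_conditions(seed=1)
ids = [id_shannon(a, w) for a, w in conditions]
print(len(conditions))                    # 16
print(f"{min(ids):.2f} {max(ids):.2f}")   # 1.00 6.02
```

The range of task difficulties does not depend on the order of presentation, so the printed minimum and maximum are the same for any seed.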
Setup Screen
Upon launching the application, a setup screen appears as shown in Figure 2. Several characteristics of the session can be set at this time. The default of ten trials per block can be set to any number using a scroll bar or by clicking on the current setting and entering a new value. The device type, set using radio buttons, is written to the data file, but otherwise has no bearing on the program. A task option is also available to select a point-and-select task (CLICK) or a grab-and-drag task (DRAG). For grab-and-drag, an object appears within the target and is acquired with a button-down action; the object is dragged to the other target and then released with a button-up action.
Figure 2. Set-up screen. Several parameters can be set to control the characteristics of the session and the format of the output file. The GO button initiates a block of trials.
The TESTCLICK and TESTDRAG buttons initiate a single block of warmup trials and return immediately to the setup screen. Anything entered in the SUBJECT or COMMENTS box is written to the output data file. The GO button initiates the sixteen blocks of trials for model building.
Results
After each session of 16 blocks, a statistical analysis appears as shown in Figure 3. Six models are calculated using each of the Fitts, Welford, and Shannon formulations with and without normalizing target width measures for the spatial variability in responses. One of the models, selected by the six buttons at the top of the display, appears in the centre of the display as a scatter plot and regression line. The un-normalized Shannon model is shown in Figure 3.
Figure 3. The results screen. A scatter plot and regression line are displayed using one of six possible Fitts' law models. The Shannon model using un-normalized values for target width (W) is displayed.
All six models are given at the bottom of the display. The coefficients shown include the intercept and slope of the regression line and the correlation. Figure 4 is the same except the normalized Shannon model has been selected for display. The prediction model for this plot is
$$MT = -68 + 141 \log_2(A / W_e + 1). \tag{4}$$
Figure 4. The results screen (as in Figure 3) showing the Shannon model with normalized values for target width.
We should mention that the technique for normalizing is applied at the model building stage only. The transformation of target width (W) into the effective target width (We) is based on the spatial variability in a block of trials. In applying the model to predict the time for a single trial, it is W that is used in the prediction equation. Using a model built with normalized measures implies that subsequent predictions carry a 4% probability that the target will be missed (MacKenzie, 1992; Welford, 1968, p. 147).
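As a worked example of applying the model, the sketch below predicts the time for a single trial from Equation 4, using the nominal target width W as described above. The function name and the example target are hypothetical, and the coefficients are assumed to be in ms and ms/bit, which is consistent with the bandwidth reported for this model.

```python
import math

def predict_mt(a, w, intercept=-68.0, slope=141.0):
    """Predicted movement time from Equation 4.

    Assumption: intercept in ms and slope in ms/bit.
    """
    return intercept + slope * math.log2(a / w + 1)

# Hypothetical target: 256 pixels away, 32 pixels wide
print(round(predict_mt(256, 32)))  # 379
```

Here ID = log2(256 / 32 + 1) = 3.17 bits, giving a predicted movement time of about 379 ms.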
The bandwidth associated with the prediction model in Equation 4 is the reciprocal of the slope coefficient. Taking the slope as 141 ms/bit, this is
1 / (0.141 s/bit) = 7.1 bits/s.
As shown in Figures 3 and 4, the correlations increased slightly from the Fitts to the Welford to the Shannon formulations; however, the correlations on the whole were lower using normalized measures. Since the sample is small and based on a single subject, we shall not dwell further on the differences in correlations. Note, however, that in a re-analysis of Fitts' (1954) data the correlations were slightly higher when the regression models were built using normalized data (MacKenzie, 1989, 1992).
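The intercept, slope, correlation, and bandwidth reported on the results screen are the products of an ordinary least-squares fit of MT on ID. The following is a minimal sketch of such a fit, not the code used in MODEL BUILDER. The data points are synthetic, generated exactly from Equation 4 so that the fit can be checked; they are not measurements.

```python
import math

def fit_line(ids, mts):
    """Ordinary least-squares fit of MT on ID.

    Returns (intercept a, slope b, correlation r).
    """
    n = len(ids)
    mean_x = sum(ids) / n
    mean_y = sum(mts) / n
    sxx = sum((x - mean_x) ** 2 for x in ids)
    syy = sum((y - mean_y) ** 2 for y in mts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ids, mts))
    b = sxy / sxx
    a = mean_y - b * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r

# Synthetic check: points generated exactly from Equation 4, not measured data
ids = [math.log2(a / w + 1) for a in (64, 256, 512) for w in (8, 64)]
mts = [-68 + 141 * i for i in ids]   # ms

a, b, r = fit_line(ids, mts)
bandwidth = 1000 / b                 # bits/s, since b is in ms/bit
print(f"a={a:.0f} ms  b={b:.0f} ms/bit  r={r:.2f}  IP={bandwidth:.1f} bits/s")
```

With real data the points scatter about the line and r falls below 1; each point would typically be the mean movement time for one block of trials.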
Output Data File
A portion of the output data file is shown in Figure 5. Most of the header information is self-explanatory. A Session is a single series of 16 blocks of trials. The Session entry automatically increments for each session and is appended to the file name to distinguish among multiple sessions if used. For example, the data file associated with Figure 5 is named John-mouse-click-1.
Figure 5. A portion of the data file created by MODEL BUILDER
========================================
Experiment: Fitts' Law Click Experiment
Comment:
Subject: John
Device: mouse
Task: click
Session: 1
Number of Blocks: 16
Trials per block: 10
Block A= 64 W= 8
58 -4.5 -22
40 0.5 -23
43 -1.5 -24
53 2.5 -27
35 -0.5 -26
45 -1.5 -26
44 1.5 -26
62 2.5 -26
56 -0.5 -26
100 -3.5 -30
Block A= 64 W= 16
27 -0.5 -14
**** etc. ****
========================================
For each trial three measurements are recorded: the time (in clock ticks, 1 / 60 s), and the X and Y pixel coordinates of selection. The first trial took (58 / 60) × 1000 = 967 ms. The Y coordinate is saved but it is not used. The X coordinate is normalized about the centre of the target. A negative value indicates the selection was on the "inside" of target-centre, toward the centre of the screen. Since the width of each target was an even number of pixels, no pixel corresponded to the precise centre of the target; hence, the ".5" weighting for each X coordinate. The occurrence of an error was not explicitly saved, but is easily determined from the X coordinate and the target width. With W = 8 in the first block in Figure 5, selections were on-target if -3.5 ≤ X ≤ +3.5, or off-target otherwise. As evident in Figure 5, one error occurred in the first block of trials for this session.
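The conversions described above are easy to reproduce in a follow-up analysis. The sketch below uses the ten trials of the first block in Figure 5; the function names are ours, and the error bound W / 2 - 0.5 is our generalization of the ±3.5 limit given for W = 8.

```python
TICKS_PER_SECOND = 60

# First block of Figure 5: (time in ticks, X, Y) for A = 64, W = 8
BLOCK = [
    (58, -4.5, -22), (40, 0.5, -23), (43, -1.5, -24), (53, 2.5, -27),
    (35, -0.5, -26), (45, -1.5, -26), (44, 1.5, -26), (62, 2.5, -26),
    (56, -0.5, -26), (100, -3.5, -30),
]

def ticks_to_ms(ticks):
    """Convert clock ticks (1/60 s) to milliseconds."""
    return ticks / TICKS_PER_SECOND * 1000

def is_error(x, width):
    """A selection is off-target when X lies outside +/-(W/2 - 0.5)."""
    return abs(x) > width / 2 - 0.5

times_ms = [ticks_to_ms(t) for t, _, _ in BLOCK]
errors = sum(is_error(x, 8) for _, x, _ in BLOCK)
print(round(times_ms[0]))   # 967
print(errors)               # 1
```

The single error is the first trial, with X = -4.5; the last trial, at X = -3.5, falls on the edge of the target and counts as a hit.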
INTERNET/Anonymous ftp Access
Users with access to INTERNET can retrieve MODEL BUILDER using file-transfer-protocol (ftp) to copy files to their own system. A typical session is shown in Figure 6 with the user's input in boldface type. Additional messages will appear that are not shown in the example.
Figure 6. A typical session to retrieve MODEL BUILDER.
==========================================================
% ftp snowhite.cis.uoguelph.ca     !connect to our archive
name: anonymous
password: name@site                !enter your name and site
ftp> cd pub/fitts-law              !change directory
ftp> dir                           !mb.Hqx should appear
ftp> get mb.Hqx                    !retrieve model builder
ftp> get README
ftp> quit
==========================================================
After retrieval, Binhex or a related utility must be used to convert MODEL BUILDER from a hex file to a binary application. Some additional information may be found in a README file which should also be retrieved.
CONCLUSION
MODEL BUILDER is simple to use and can be explored by HCI educators in teaching about performance models such as Fitts' law. The immediate appearance of a statistical analysis is of tremendous value to students who are about to embark on experiments of their own. Certainly, caution should be exercised in drawing any conclusions from the single-subject, single-session statistics presented. There is no substitute for rigour in statistical analysis, and any serious attempt to use MODEL BUILDER should by-pass the tentative analysis provided and work with the data files saved after each session.
ACKNOWLEDGEMENT
We would like to acknowledge the contribution of Pavel Rozalski who wrote the software, and the members of the Input Research Group at the University of Toronto.
This research was supported by the Natural Sciences and Engineering Research Council of Canada, Xerox Palo Alto Research Center, Digital Equipment Corp., and Apple Computer Inc. We gratefully acknowledge this contribution, without which this work would not have been possible.
REFERENCES
Boritz, J., Booth, K. S., & Cowan, W. B. (1991). Fitts's law studies of directional mouse movement. Proceedings of Graphics Interface '91 (pp. 216-223). Toronto: Canadian Information Processing Society.
Buxton, W. (in press). Haptic input to computer systems. Cambridge, UK: Cambridge University Press.
Card, S. K., English, W. K., & Burr, B. J. (1978). Evaluation of mouse, rate-controlled isometric joystick, step keys, and text keys for text selection on a CRT. Ergonomics, 21, 601-613.
Card, S. K., Mackinlay, J. D., & Robertson, G. G. (1990). The design space of input devices. Proceedings of the CHI '90 Conference on Human Factors in Computing Systems (pp. 117-124). New York: ACM.
Fitts, P. M. (1954). The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47, 381-391.
Gillan, D. J., Holden, K., Adam, S., Rudisill, M., & Magee, L. (1990). How does Fitts' law fit pointing and dragging? Proceedings of the CHI '90 Conference on Human Factors in Computing Systems (pp. 227-234). New York: ACM.
Greenstein, J. S., & Arnaut, L. Y. (1988). Input devices. In M. Helander (Ed.), Handbook of human-computer interaction (pp. 495-519). Amsterdam: Elsevier.
MacKenzie, I. S. (1989). A note on the information-theoretic basis for Fitts' law. Journal of Motor Behavior, 21, 323-330.
MacKenzie, I. S. (1992). Fitts' law as a research and design tool in human-computer interaction. Human-Computer Interaction, 7, 91-139.
MacKenzie, I. S. (in press). Movement time prediction in human-computer interfaces. Proceedings of Graphics Interface '92. Toronto: Canadian Information Processing Society.
MacKenzie, I. S., Sellen, A., & Buxton, W. (1991). A comparison of input devices in elemental pointing and dragging tasks. Proceedings of the CHI '91 Conference on Human Factors in Computing Systems (pp. 161-166). New York: ACM.
Marchionini, G., & Sibert, J. (1991). An agenda for human-computer interaction: Science and engineering serving human needs. SIGCHI Bulletin, 23(4), 17-32.
Meyer, D. E., Smith, J. E. K., Kornblum, S., Abrams, R. A., & Wright, C. E. (1990). Speed-accuracy tradeoffs in aimed movements: Toward a theory of rapid voluntary action. In M. Jeannerod (Ed.), Attention and performance XIII (pp. 173-226). Hillsdale, NJ: Erlbaum.
Milner, N. P. (1988). A review of human performance and preferences with different input devices to computer systems. In D. Jones & R. Winder (Eds.), People and Computers IV: Proceedings of the Fourth Conference of the British Computer Society -- Human-Computer Interaction Group (pp. 341-362). Cambridge, UK: Cambridge University Press.
Newell, A., & Card, S. K. (1985). The prospects for psychological science in human-computer interaction. Human-Computer Interaction, 1, 209-242.
Thomas, C., & Milan, S. (1987). Which input device should be used with interactive video? In B. Shackel (Ed.), Human-Computer Interaction -- INTERACT '87 (pp. 587-592). Amsterdam: Elsevier.
Walker, N., & Smelcer, J. B. (1990). A comparison of selection times from walking and pull-down menus. Proceedings of the CHI '90 Conference on Human Factors in Computing Systems (pp. 221-225). New York: ACM.
Ware, C., & Mikaelian, H. H. (1987). An evaluation of an eye tracker as a device for computer input. Proceedings of the CHI+GI '87 Conference on Human Factors in Computing Systems and Graphics Interface (pp. 183-188). New York: ACM.
Welford, A. T. (1968). Fundamentals of skill. London: Methuen.