1. INTRODUCTION

This research explores and extends the use of Fitts' law, a psychological model of human movement, as a performance model for human-computer interaction. Theoretical issues are developed to verify and improve the model's predictive power, and practical extensions are proposed to bring the model in line with current modes of user input to computers.

The recent emergence of personal computers with user interfaces employing a desktop metaphor has raised the profile of this model in the relatively new field of human-computer interaction (HCI). Users often interact with today's systems "directly", by manipulating iconic objects on a CRT display using a mouse and one or more buttons. Many operations require pointing and/or dragging. Although pointing operations have been the object of a moderate amount of Fitts' law research, dragging operations (with one exception) have yet to be modelled according to Fitts' law.

The theory underlying Fitts' relationship is sufficiently complex, and the ideas presented here sufficiently subtle, that a thorough analysis of the model is essential before we elaborate on scenarios for modifying and extending the model. We begin with an historical and theoretical foundation, summarizing the model's derivation and Fitts' original experiments. In the wake of consistent departure of observations from predictions, many follow-up studies questioned the validity of the model. An analysis of Fitts' original data highlights these problems, and adjustments are cited that return the model to the information-theoretic premise upon which it is based. Competing models are presented and compared with Fitts' model. A literature review summarizes the findings of a large body of research in experimental psychology and a somewhat smaller body of research in human factors and human-computer interaction.

Hypotheses of the present research are stated in a form suitable for statistical testing. Finally, three experiments are described in which the necessary data are gathered to test the hypotheses.

1.1 Background

Following the work of Shannon, Wiener, and other information theorists in the 1940s, "information" models of psychological processes emerged with great fanfare in the 1950s (e.g., see Miller, 1953; Pierce, 1961, chap. XII). The terms "probability", "redundancy", "bits", "noise", and "channels" entered the vocabulary of experimental psychologists as they explored the latest technique for measuring and modeling human behaviour. Two surviving models are the Hick-Hyman law for choice reaction time (Hick, 1952; Hyman, 1953), and Fitts' law for the information processing capacity of the human motor system (Fitts, 1954).

In the decades since Fitts' original publication, his relationship, or "law", has proven one of the most robust, highly cited, and widely adopted models to emerge from experimental psychology. Psychomotor studies in diverse settings – from under a microscope to under water – have shown a high correlation between Fitts' measure of task difficulty and the time required to complete a movement task. Kinematics and human factors are two fields that are particularly rich in investigations of human performance using Fitts' analogy.

In the relatively new discipline of human-computer interaction (HCI), there is also an interest in the mathematical modeling and prediction of human performance using an information processing analogy. The starting point in HCI is the work of Card, English, and Burr (1978). In their comparison of four devices for selecting text on a CRT, Fitts' law provided good movement time prediction for the joystick and mouse. Over 80% of the variation in movement time was accounted for by regression equations. In the subsequent Keystroke-Level Model for predicting user performance times (Card, Moran, & Newell, 1980), Fitts' law was cited as an appropriate tool for predicting pointing time but was omitted from the model in favour of a constant. The value tP = 1.10 s was derived from the Fitts' law prediction equation in Card et al. (1978) and served as a good approximation for pointing time over the range of conditions employed. Similarly, the Model Human Processor of Card, Moran, and Newell (1983, p. 26) comprises nine "principles of operation". These have been the focus of a substantial body of empirical research leading to a psychological model of the human as an information processor. As the performance model for the human motor processor, Fitts' law, Principle P5, plays a prominent role in the Model Human Processor.

The scheduling of a "Fitts session" at the 1990 conference of the Association for Computing Machinery's Special Interest Group on Computer and Human Interaction (SIGCHI) is evidence of the continued interest in the application of Fitts' model to human input to computers. Even the popular press is recognizing the role of psychology in designing systems that address human needs. A recent "Report on Computers" supplement in the Globe and Mail cites the work of Card, Moran, Newell, Shneiderman and others in taking the guesswork out of design through empirical research (Holmes, 1990).

1.2 A Role for Human Performance Models

Terms like "system" or "model" are catch-all descriptors, often used without a clear and simple definition. A mathematician's model is probably quite remote from a psychologist's or social scientist's model. To the mathematician, a model entails a formal calculus tested through computer simulation; to the psychologist or social scientist it is often a verbal-analytic description of behaviour virtually synonymous with the term "theory". Pew and Baron (1983) offer that

there is no useful distinction between models and theories. We assert that there is a continuum along which models vary that has loose verbal analogy and metaphor at one end and closed-form mathematical equations at the other, and that most models lie somewhere in-between (p. 664).

Fitts' law may be placed in this continuum. As a mathematical expression, it emerged from the rigors of probability theory, yet when transplanted into the realm of psychomotor behaviour it becomes a metaphor.

The need for a reliable prediction model of movement time in computer input tasks is stronger today than ever before. Bit-mapped graphic displays have all but replaced character-mapped displays, and office and desktop metaphors are gaining in popularity over menus and command lines. Today's user interfaces often supplant cursor keys and function keys with a mouse and pull-down menus. As the human-machine link becomes more direct, speed-accuracy models of human movement correspond more closely to the actions performed in human-computer dialogues. Design models, such as the Keystroke-Level Model, need to accommodate the current range of movement activities in computer input tasks. Fitts' law can fill that need.

This study endeavours to assess critically the current state of Fitts' law and to suggest ways in which future research and design may benefit from a rigorous and corrected adaptation of this powerful model. Newell and Card (1985) expand on the role for theoretical models in the design of human-computer interfaces:

Another way [for theory to participate] is through explicit computer program tools for the design. The theory is embodied in the tool itself, so that when the designer uses the tool, the effect of the theory comes through, whether he or she understands the theory or not (p. 223).

Psychological theories and experiments, such as Fitts' index of difficulty . . . can shape the way a designer thinks about a problem. Analyses of the key constraints of a problem can point the way to fertile parts of the design space. Providing tools for thought is a more effective way of getting human engineering into the interface than running experiment comparisons between alternative designs (p. 238).

Arguably though, conducting empirical experiments to validate models is the starting point. Putting the theory into tools comes later. When properly applied and integrated into tools, however, theories may indeed elicit new ways of thinking for designers.

1.3 Information Theory Foundation

Fitts' law is a model of human psychomotor behaviour based on Shannon's Theorem 17, a fundamental theorem of communication systems (Fitts, 1954; Shannon & Weaver, 1949). The realization of movement in Fitts' model is analogous to the transmission of "information" in electronic systems. Movements are assigned an index of difficulty, in "bits", and in carrying out a movement task the human motor system is said to transmit so many "bits of information". If the number of bits is divided by the time to move, then a rate of transmission in "bits per second" can be ascribed.

Fitts' idea was novel for two reasons: first, it suggested that the difficulty of a motor task could be measured using the information metric "bits"; and second, it introduced the idea that in human movement, information is transmitted through a channel – a human channel. With respect to electronic communication systems, the concept of a channel is straightforward: a signal is transmitted through a non-ideal medium (such as copper, air, or glass) and is perturbed by noise. The effect of the noise is to reduce the information capacity of the channel from its theoretical maximum. Shannon's Theorem 17 expresses the effective information capacity C (in bits/s) of a communications channel of bandwidth B (in s^-1 or Hz) as

C = B log2((S + N) / N)

or in the form

C = B log2(S / N + 1) (1)

where S is the signal power and N is the noise power (Shannon & Weaver, 1949, pp. 100-103).
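As a rough numerical illustration of Equation 1, the following sketch computes C for a made-up channel. The function name and the bandwidth and power values are hypothetical and chosen only so that the arithmetic is easy to follow; they do not come from Shannon or Fitts.

```python
import math


def channel_capacity(bandwidth_hz: float, signal_power: float, noise_power: float) -> float:
    """Effective capacity C in bits/s from Equation 1: C = B log2(S / N + 1)."""
    if bandwidth_hz <= 0 or signal_power < 0 or noise_power <= 0:
        raise ValueError("bandwidth and noise power must be positive, signal power non-negative")
    return bandwidth_hz * math.log2(signal_power / noise_power + 1)


# Hypothetical channel: B = 3000 Hz and S / N = 1023, so S / N + 1 = 1024 = 2**10.
capacity = channel_capacity(3000.0, 1023.0, 1.0)
print(capacity)  # 3000 * 10 = 30000.0 bits/s
```

Doubling the noise power in this sketch lowers the capacity, which is the sense in which noise reduces a channel's capacity from its theoretical maximum.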

The notions of "channel" and "channel capacity" are not as straightforward in the domain of human performance. The problem lies in the measurement of human channel capacity. Although electronic communication systems transmit information with specific and optimized codes, this is not true of human "channels". Human coding is ill-defined, personal, and often irrational or unpredictable. Optimization is dynamic, intuitive. Cognitive strategies emerge in everyday tasks through "chunking", which is analogous to "coding" in information theory – the mapping of a diverse pattern (or complex behaviour) into a simple pattern (or behaviour). Neuromuscular coding emerges through the interaction of nerve, muscle, and limb groups during the acquisition and repetition of skilled behaviour. Difficulties in identifying and measuring cognitive and neuromuscular factors confound the measurement of the human channel capacity, causing tremendous variation to surface in different experiments seeking to investigate similar processes.

1.4 Equation by Parts

Fitts sought to establish the information capacity of the human motor system. This capacity, which he called the index of performance or IP, is analogous to channel capacity C in Shannon's theorem. IP is calculated by dividing a motor task's index of difficulty, ID, by the movement time, MT, required to complete the task. Thus,

IP = ID / MT. (2)

Equation 2 matches Equation 1 directly, with IP corresponding to C (in bits/s), ID corresponding to the log term in Equation 1 (in bits), and MT corresponding to 1/B (in seconds).

Fitts claimed that electronic signals are analogous to movement distances or amplitudes (A) and that noise is analogous to the tolerance or width (W) of the region within which a move terminates. Loosely based on Shannon's logarithmic expression, the following was offered as the index of difficulty for a motor task:

ID = log2(2A / W). (3)

Since A and W are both distances, their ratio within the logarithm is without units. The notion of "bits" as the unit of task difficulty stems from the somewhat arbitrary choice of base "two" for the logarithm. (Had base "ten" been used, the units would be "digits".)
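The effect of the logarithm base on the unit of difficulty can be sketched as follows. The function name and the values of A and W are hypothetical; only the formula is taken from Equation 3.

```python
import math


def index_of_difficulty(amplitude: float, width: float, base: float = 2.0) -> float:
    """Index of difficulty from Equation 3, ID = log(2A / W), in the given base."""
    if amplitude <= 0 or width <= 0:
        raise ValueError("amplitude and width must be positive")
    # Change of base: log_base(x) = log2(x) / log2(base).
    return math.log2(2.0 * amplitude / width) / math.log2(base)


# Hypothetical task: A = 8 and W = 1 in the same unit of distance, so 2A / W = 16.
id_bits = index_of_difficulty(8.0, 1.0)          # base two: 4.0 bits
id_digits = index_of_difficulty(8.0, 1.0, 10.0)  # base ten: about 1.20 "digits"
print(id_bits, round(id_digits, 2))
```

Because A and W enter only as a ratio, expressing both in centimetres rather than inches leaves ID unchanged.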

A useful variation of Equation 2 places movement time on the left as the predicted variable:

MT = ID / IP. (4)

This relationship is tested by devising a series of movement tasks with ID (that is, A and W) as the controlled variable and MT as the dependent variable. In an experimental setting, subjects are required to move to and acquire targets of width W at a distance A as quickly and accurately as possible. ("Accurate", for the moment, implies a small but consistent error rate.) Several levels are provided for each of A and W, yielding a range of task difficulties.
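A minimal sketch of how crossing several levels of A and W yields a range of task difficulties is given below. The particular amplitudes and widths are illustrative assumptions (any consistent unit of distance would do), not a description of the conditions in a specific experiment.

```python
import math
from itertools import product

# Hypothetical levels of the two controlled variables, in the same distance unit.
amplitudes = [2.0, 4.0, 8.0, 16.0]
widths = [0.25, 0.5, 1.0, 2.0]

# Every A-W combination and its index of difficulty, ID = log2(2A / W).
conditions = [(a, w, math.log2(2.0 * a / w)) for a, w in product(amplitudes, widths)]

# Different A-W pairs with the same ratio share an ID, so there are fewer
# distinct difficulty levels than conditions.
distinct_ids = sorted({task_id for _, _, task_id in conditions})

print(len(conditions))  # 16 conditions
print(distinct_ids)     # 1.0 to 7.0 bits in steps of 1 bit
```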

The index of performance (IP) can be calculated directly using Equation 2 by dividing a task's index of difficulty by the observed movement time (averaged over a block of trials), or it can be determined by regressing MT on ID. In the latter case, the regression line equation is

MT = a + b ID (5)

where a and b are regression coefficients. The reciprocal of the slope coefficient, 1 / b, corresponds to IP in Equation 4, obtained through direct calculation. These two values for IP will generally differ because of the different methods of calculation: direct calculation assumes a zero intercept, whereas the regression estimates the intercept a separately from the slope.
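The two ways of obtaining IP can be compared with a short sketch. The movement times below are invented for illustration and are not Fitts' data; `fit_line` is a hypothetical helper implementing ordinary least squares for a single predictor.

```python
def fit_line(xs, ys):
    """Least-squares fit of ys = intercept + slope * xs; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical mean movement times (s) for tasks of 1 to 7 bits.
ids = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
mts = [0.18, 0.26, 0.37, 0.45, 0.57, 0.66, 0.74]

# Regression method (Equation 5): IP is the reciprocal of the slope b.
intercept, slope = fit_line(ids, mts)
ip_regression = 1.0 / slope

# Direct method (Equation 2): IP = ID / MT per condition, then averaged.
ip_direct = sum(task_id / mt for task_id, mt in zip(ids, mts)) / len(ids)

print(f"a = {intercept:.3f} s, b = {slope:.4f} s/bit")
print(f"IP (1 / b) = {ip_regression:.2f} bits/s, IP (ID / MT) = {ip_direct:.2f} bits/s")
```

With these invented data the intercept is positive, so the regression-based IP comes out higher than the directly calculated one; other data could show a smaller or reversed gap.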

The intercept coefficient, a, is sometimes viewed as an error term. A non-zero intercept is troublesome since it suggests that a movement task with "zero difficulty" has a non-zero predicted completion time. The usual form of Fitts' law is Equation 5 expanded as follows:

MT = a + b log2(2A / W). (6)

The factor "2" in the logarithm was added by Fitts as an arbitrary adjustment to ensure that ID was greater than zero for the range of experimental conditions employed in his experiments (Fitts, 1954, p. 388). The "2" increases the index of difficulty by 1 bit for each task but has no effect on the MT-ID correlation or on the slope of the regression line equation. The constant "1" in Shannon's original equation was omitted by Fitts without justification. We shall have more to say about the intercept and slope coefficients and the form of the logarithm term in the next chapter.
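The claim that the factor "2" adds 1 bit to every task without changing the slope can be checked with a short derivation, offered here as a sketch using only the symbols of Equations 3 and 6:

```latex
\begin{align*}
ID &= \log_2\!\left(\frac{2A}{W}\right) = \log_2 2 + \log_2\!\left(\frac{A}{W}\right)
    = 1 + \log_2\!\left(\frac{A}{W}\right), \\
MT &= a + b \log_2\!\left(\frac{2A}{W}\right)
    = (a + b) + b \log_2\!\left(\frac{A}{W}\right).
\end{align*}
```

Adding a constant to the predictor leaves the slope b and the MT-ID correlation unchanged. The intercept is not invariant, however: a regression on log2(A / W) alone would yield an intercept of a + b rather than a.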