Christopher D. Green
Department of Psychology
(1994) Canadian Artificial Intelligence, 35, 33-36
It goes without saying that connectionism is big in cognitive science these days. In fact, it is so big that there are many--seemingly, most of the members of the Cognitive Science Society--who think it just is cognitive science; that pretty well everything that goes by the names of experimental psychology, philosophy of mind, linguistics, and ("classical") AI is just so much alchemy, phlogiston and... well, pick your favorite dead science.
It took connectionism a while to catch on, getting its start in the 1940s and going through some ups and downs in the 1960s. After several technical innovations, the flame began to burn again in the early 1980s, bursting into a full-blown inferno soon after the publication of Rumelhart and McClelland's Parallel distributed processing in 1986. It is hard to believe that was less than a decade ago. Academic conferences, scholarly journals, and whole university departments have since sprung up dedicated to the proposition that connectionism constitutes the "proper treatment," to borrow Smolensky's famous phrase, of cognitive science.
The problem has been that, until very recently, much of this work has carried with it a "Gee whiz!" tone; the feeling that doing connectionist research is just so doggone much fun that it would be a shame to spoil it with sharp questions or incisive criticism. Although it is necessary, as philosopher of science Imre Lakatos said, to give new scientific research programs "breathing space," the grandness of the claims emanating from connectionist cognitive scientists rose to such dizzying heights in such short order that the old guard quickly came to the point where they felt it necessary to snap back, and hard. The best known of these counter-offensives was Fodor and Pylyshyn's (1988) "Connectionism and cognitive architecture," in which the authors argued that connectionist systems can only correctly model language (and thought itself) to the degree that they implement a traditional symbol-processing system, and to the degree that they do that, the implemented symbol-processing model is the cognitively interesting level of analysis. The Fodor and Pylyshyn paper "threw a scare into the field," as one connectionist (Chalmers, 1991) put it. Although many defenses of connectionism quickly followed, the celebratory tone ebbed somewhat, to be replaced by more sober and sophisticated thinking, not just about the networks themselves (the leading connectionists had always been technical wizards) but, more importantly, about the philosophical and theoretical issues involved. The real battle had just begun.
Andy Clark's latest book, Associative Engines: Connectionism, concepts, and representational change (1993), is from this second generation of connectionist work. It is thoughtful and intelligent in its attempts to grapple with some of the central problems of cognition, more particularly with the significant criticisms that have been sent connectionism's way since its arrival as a major player in cognitive science. His command of the connectionist literature is masterful. He includes a great deal of recent work--some very recent work--emanating from sources such as "hot" new journals like Connection Science, unpublished technical reports, and even Internet discussions. Perhaps even more importantly, his command of the literature critical of connectionism is masterful as well. Straw men are few, though some do turn up to advance the narrative flow.
This does not mean that the book is not partisan. Despite Clark's claim to being "ecumenical" at the outset, the overall thrust of the book is that connectionism has (or soon will) overcome most every gauntlet that has been thrown in its path, and those difficulties that it has failed to redress have not yet been solved by the "classical" (i.e., serialist, symbol-processing) approach either. Jerry Fodor, not surprisingly, plays the part of Grendel in Clark's story but, as Clark is far more considered than was Beowulf, Fodor is depicted more even-handedly and with more respect, even if it is clear from the start that he is destined to fail in the epic struggle that ensues. There is mention of a "Copernican revolution" in cognitive science, and many points where opponents' view of the connectionist Truth is obscured by their "infection" (as he calls it at one point) by the classical paradigm. In short, the book is a spirited, but valuable and intelligent, defense of connectionist cognitive science.
Associative engines is divided into two parts, entitled "Melting the inner code" and "From code to process," respectively. The first section opens with a chapter that sets out the main distinctions between classical ("anti-developmental," "text-based," "folk-psychological") and connectionist ("developmental," "process-based," "non-folk-psychological") orientations. The most interesting feature is Clark's rejection of the common assumption--advanced mainly by Stich and the Churchlands--that connectionism leads naturally to the elimination of the whole folk-psychological vocabulary (beliefs, desires, etc.). He argues that, "the legitimacy and the value of the folk solids [e.g., concepts, propositions, attitudes] are independent of the truth of the [Fodorian] Syntactic Image as an empirical model of mind" (p. 8). Thus, he has carved out an interesting position for himself, abandoning folk psychology as the basis of cognitive theory, but pulling up short of the precipice of eliminativism.
Chapter 2 is an unabashed advertisement for the "unique selling point(s)" of connectionism. These include the superposition and distribution of representations which, it is argued, result in a natural account of the semantic relatedness of representations. The use of "semantic" is somewhat idiosyncratic here. What Clark has in mind is that, in a network representing the shapes of letters, the pattern of activation representing "E" will be more similar to that for "F" than to that for "C" because they have more physical features in common. Interestingly, this leads one to think that one is about to get an account of how to interpret units in a distributed representation (surely, in the above example, the representations are distributed with respect to letters, but not with respect to the component line segments of letters, an ambiguity in "distributed representation" that has long bothered me). This is not to be, however. Uninterpretability of individual units turns out to be an almost necessary feature of successful connectionist models. In any case, the argument continues that connectionist virtues such as "automatic prototype extraction," "generalization" (to new cases), "flexibility," and, most importantly, "context sensitivity" are among the benefits conferred by distributed representation.
The last of these gets special attention, as it has often been put to connectionists that, below the level of the global representation, the individual units must be constitutive of local representations (albeit ones that are difficult to interpret) and, thus, the difference between connectionist and classical systems is more apparent than real. The response seems to be that as one travels down from the level of the representation to the level of the units, one finds more context sensitivity rather than less; that, indeed, normal concepts such as COFFEE come at the end of the process, only as a sort of amalgamation of individual coffee-contexts (and, in turn, their derivative contexts), not as the starting point from which the contexts are developed.
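The letter example from Chapter 2 can be made concrete with a toy calculation. The sketch below is only an illustration, not anything taken from Clark's book: the five stroke features and the 0/1 codings for "E," "F," and "C" are invented for the purpose, and cosine similarity is just one of several reasonable ways to compare activation patterns.

```python
from math import sqrt

# Hypothetical stroke features (an assumption for illustration only):
# top stroke, middle bar, bottom stroke, left vertical, curved body.
LETTERS = {
    "E": (1, 1, 1, 1, 0),
    "F": (1, 1, 0, 1, 0),
    "C": (1, 0, 1, 0, 1),  # the ends of "C" are loosely coded as top/bottom strokes
}


def cosine(a, b):
    """Cosine similarity between two equal-length activation patterns."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity(first, second):
    """Similarity of the patterns standing for two letters."""
    return cosine(LETTERS[first], LETTERS[second])


print("E vs F:", round(similarity("E", "F"), 3))
print("E vs C:", round(similarity("E", "C"), 3))
```

Under this coding the pattern for "E" overlaps more with "F" than with "C," which is the sense of "semantic relatedness" at issue. Note that each unit here is locally interpretable as a stroke, which is exactly the ambiguity in "distributed representation" raised above.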
Chapters 3 and 4 address, respectively, the questions of what a connectionist network can and cannot be said to "know." It is here that the account of what such a network might be said to explain, from a scientific perspective, is addressed. This is an important question for many critics who have charged that even if a network could simulate all human behavior, it would be of, at best, limited scientific use as an explanation of human behavior. Clark argues for an inversion of the traditional explanatory process. The classical move, he explains, is to begin with a high-level description that specifies the kinds of abstract principles that generate the surface phenomena (Chomskyan linguistics is taken as the exemplar), and then to try to discover what kinds of increasingly concrete mechanisms (e.g., computational, then physiological, then molecular, etc.) could carry out the abstract ones. The connectionist vision is said to reverse this process: attend only to the surface phenomena of the domain itself (assume no underlying abstract principles), and then (dropping immediately to the "bottom" level) try to get a connectionist network to simulate them. Finally, through multivariate statistical processes such as cluster analysis and principal components analysis (PCA), work your way back "up" to try to find out what abstract principles describe the network's performance. As intriguing as this account of explanation is, one is forced to wonder why one would assume, a priori, that the "bottom" level must be connectionist if one were not already a partisan. Is it now to be considered a methodological failing not to assume so?
Clark offers a number of important caveats about not treating the statistical analyses as being the "real" explanation, primarily because, useful as they may be, they fail to capture a number of important facts concerning the dynamic nature of the systems under study. This is all true, and a number of critics of connectionism have stumbled on this point in the past (although he doesn't discuss Hinton's unsupervised learning networks which, as I understand them, actually do cluster analysis and PCA in order to produce their representations). Chapter 4 also includes discussions of Cussins's notion of "non-conceptual content," and the relevance of Karmiloff-Smith's studies of the developmental aspects of picture-drawing to the development of expertise in connectionist systems.
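For readers unfamiliar with the "working back up" step, the following sketch shows roughly what a principal components analysis of hidden-unit activations involves. Everything in it is an assumption made for illustration: the four words, the three hidden units, and the activation values are invented, and the first component is found by simple power iteration rather than by any method described in the book.

```python
from math import sqrt

# Invented hidden-unit activations for four inputs (illustrative values only).
HIDDEN = {
    "dog": [0.9, 0.8, 0.1],
    "cat": [0.8, 0.9, 0.2],
    "run": [0.1, 0.2, 0.9],
    "eat": [0.2, 0.1, 0.8],
}


def center(rows):
    """Subtract each unit's mean activation from its column."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, dims = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(dims)]
            for i in range(dims)]


def first_component(cov, iterations=200):
    """Approximate the leading eigenvector by power iteration."""
    vec = [1.0] * len(cov)
    for _ in range(iterations):
        nxt = [sum(c * v for c, v in zip(row, vec)) for row in cov]
        norm = sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec


words = list(HIDDEN)
centered = center([HIDDEN[w] for w in words])
component = first_component(covariance(centered))
# Score of each word along the first principal component.
scores = {w: sum(x * c for x, c in zip(row, component))
          for w, row in zip(words, centered)}
print(scores)
```

With these made-up numbers the noun-like and verb-like inputs fall on opposite sides of the first component. As Clark's caveat has it, such a summary is a static snapshot of the activations and says nothing about the dynamics that produced them.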
Chapter 5, the last of Part I, goes over some fairly old psychological material on concepts, categories, and prototypes. Whereas the rest of the book is notable for the recency of the material it contains, this one is something of a disappointment. Clark's fluency with the psychological literature does not match that he has with the connectionist. He calls prototype theory, "an increasingly popular account," whereas it seems that prototype theory in psychology is on the wane after having been very prominent from the mid-1970s through to the mid-1980s. What is true in Clark's account is that connectionist networks have a natural affinity with theories of cognition in which prototypes (for all their problems) figure prominently, and have, thus, been picked up by a number of connectionist researchers. If they can revive the area--and I fear Clark's arguments here are at their weakest, though still worth considering--then more power to them.
In the second part of the book Clark explores a series of applications of recent connectionist research to long-standing problems in connectionism and, as the section progresses, in cognitive science more generally. In chapter 6 he critically examines the notions of compositionality (i.e., the fact that what sentences mean is a function of what their component words mean) and explicitness of representation. Following van Gelder, and using examples from research by Smolensky and by Chalmers, he argues that systems can behave in ways that are functionally compositional even if they do not obviously concatenate symbols (as the sentences of language do). Then, following Kirsh, he argues that the explicitness of a representation is in the eye of the beholder--i.e., what may not seem explicit to us (e.g., the structure of representations in a connectionist network) may be perfectly obvious to a system designed to extract information stored in such a way (viz., the network itself). Transferring such knowledge to new domains has been a problem for connectionists--with context-sensitivity comes context-dependence--but even here, Clark argues, "progress has been made," namely in the form of self-modularizing systems.
Chapter 7 explores explicitly the problems that several researchers have encountered getting their networks to generalize to new inputs. Of particular interest here are Elman's attempts to teach a net the grammar of natural language. The featured solution to such problems is "phased training," in which the training schedule of the network is customized such that it "sees" only the easiest, or most "central," cases first, followed by a regime of gradual increase in the level of difficulty. This keeps the network from "search[ing] wildly" as Clark puts it, and getting, "lost in space(s)." All this is said to lead to a "forced union" (p. 145) between connectionism and developmental psychology. It seems far from clear, however, that children are given "phased training" in language. They hear all kinds of language all the time, and the current evidence is that the presence of simple "care-giver language" (a.k.a. "motherese")--which I take to be the analog to the first phase of training--has no impact at all on the speed or ease with which children learn language.
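The idea of a phased schedule is easy to state in code. The sketch below is a guess at the general shape of such a regime, not a reconstruction of Elman's procedure: the function name `phased_schedule`, the toy sentences, and the use of word count as a stand-in for difficulty are all assumptions introduced here.

```python
def phased_schedule(examples, difficulty, n_phases):
    """Return one training set per phase, each adding harder examples.

    `difficulty` is any function that scores an example; lower means easier.
    Every phase keeps the earlier, easier material and extends it.
    """
    ranked = sorted(examples, key=difficulty)
    phases = []
    for phase in range(1, n_phases + 1):
        # Ceiling division, so the last phase always contains every example.
        cutoff = (len(ranked) * phase + n_phases - 1) // n_phases
        phases.append(ranked[:cutoff])
    return phases


# Toy corpus (invented); word count is a crude proxy for syntactic complexity.
SENTENCES = [
    "boys who chase dogs see girls",
    "boys chase dogs",
    "girls who boys who dogs chase see walk",
    "dogs see girls",
]


def word_count(sentence):
    return len(sentence.split())


phases = phased_schedule(SENTENCES, word_count, n_phases=2)
for number, batch in enumerate(phases, start=1):
    print("phase", number, batch)
```

The network would be trained on each set in turn. Whether anything like the first, simplified phase exists in a child's linguistic environment is, of course, the empirical question raised above.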
The chapter closes with an explicit response to the famous Fodor-Pylyshyn systematicity argument. Why, asks Clark, should we assume that minds are preset to be systematic in their thinking just because they ultimately turn out that way? "Instead of treating [systematicity] as a property to be directly induced by a canny choice of basic architecture," he argues, "it may be fruitful to try to treat it as intrinsic to the knowledge we want a system to acquire" (p. 148). And later, "although systematicity (in mature, adult human thought) is indeed pervasive, it need not be traced directly to the nature of the underlying cognitive architecture. Instead, it may be fruitful to try thinking of systematicity as a knowledge-driven achievement" (p. 150).
Chapter 8 focuses on a series of simulations in which stage-wise learning behavior, reminiscent of that exhibited by children, emerges as a natural property of the networks' learning procedure. Chapter 9 features the "artificial life" research of Nolfi and Parisi, showing how connectionist networks can escape the criticism of being helpless hostages of their input sets. Through simulated evolution, new networks "come into the world" with initial weight sets that, while falling far short of the strong nativism advocated by Fodor, are predisposed to deal effectively with their input "environments." Clark calls this endowment "minimal nativism."
Chapter 10, for my part, is the most interesting. It is here, more than in any other part of the book, that Clark shows his estimable talents at Good Old Philosophy Of Mind (GOPhOM?). Here he returns to detail and defend the interesting claim made at the beginning of the book that one can be a connectionist without having to reject the vocabulary of Folk Psychology wholesale. He begins with a review of the considerations that drove Stich to eliminativism. At the risk of streamlining the story too much, Stich argues that if all our beliefs are stored in a superposed form (i.e., in a connectionist network), then no sense can be made of the claim that any one of them was responsible for a particular action. Since the activity of the whole net went into producing the output, all of our beliefs must be equally responsible. Thus, "belief" turns out to be a notion of little explanatory value. Skipping a lot of very interesting discussion in the interests of brevity, Clark's response, in essence, is that although beliefs are not, in any straightforward way, the causes of our actions, they still play a role in explanations of our behavior. By way of analogy, he points out that the claim, "the match lit because it was struck," does not describe the causal microstructure of combustion, but it does give us a good counterfactual-supporting explanation (i.e., if the match hadn't been struck, ceteris paribus, it wouldn't have lit). Consequently, the fact that individual beliefs are not isolable in the network has little consequence for their ontological status or explanatory value: "the folk explanations," Clark insists, "simply occupy a different arena" (p. 207). If this sounds suspiciously similar to the position Gilbert Ryle advanced in the 1940s, it is, and Clark admits it is. But then this is not surprising, given that he explicitly aligns himself (almost) with Dan Dennett, who (I believe) was a student of Ryle's.
Where Clark differs is where he, almost as an afterthought, declares that a True Believer (i.e., a being that can truly be said to have beliefs) must have the ability to evaluate the rationality of its past cognitive performance, and must have consciousness. About this he, unfortunately, says precious little else. Finally, there is a short chapter reviewing the main threads of argument woven throughout the book.
To wrap up, this is a very good book on a number of counts. It reviews the best of the most recent work in connectionist cognitive science. It carries out this task in a vigorously partisan, but by no means blind or unthinking, way. It also does it in an entirely non-mathematical way (this will, of course, be an advantage for some, a frustration for others). It responds to the major criticisms of the connectionist paradigm with interesting, informed, well-argued points. It shows how one might accept connectionism without having to embrace eliminativism in one and the same gesture. It is appropriate for a wide audience: those professional cognitive scientists actively engaged in the battle and eager to hear the "latest word" from the connectionist camp, as well as those wanting to do a little "catching up." Graduate students who are just getting their first serious exposure to the promise of connectionist research in cognitive science will benefit from it as well. This book will go a long way.
Chalmers, D. J. (1991). Why Fodor and Pylyshyn were wrong: The simplest refutation. Proceedings of the Twelfth Annual Conference of the Cognitive Science Society, 340-347.
Clark, A. (1993). Associative engines: Connectionism, concepts, and representational change. Cambridge, MA: MIT Press.
Fodor, J. A. & Pylyshyn, Z. W. (1988). Connectionism and cognitive architecture: A critical analysis. In S. Pinker & J. Mehler (Eds.), Connections and symbols (pp. 3-71). Cambridge, MA: MIT Press.