Stalking the Wild E-print: A Scout's Impressions of Publicatia Incognita

Christopher D. Green
York University

christo@yorku.ca

© 2002 by Christopher D. Green
First posted
19 November 2002
Revised version posted
25 November 2002


My specialty is not communications but, rather, psychology -- specifically computational models of cognition and the history of psychology. As a result, my concerns may not be your concerns exactly, but this early in the game of figuring out what the implications of the new electronic media are for scholarly publication, I doubt we have yet traveled very far apart. Most of what I will talk about today is very basic, and in as little as five years what I have to say will, no doubt, seem either hopelessly naive or utterly misguided. This prospect doesn't daunt me much because one of the significant advantages of electronic media (though it is occasionally seen as a drawback as well) is that most anything can be deleted with the click of a button. This will make it much easier for me to say "What conference presentation?" if I am quizzed about my current views by some much-smarter-than-I PhD student early in the 2010s.

I was asked to speak on this panel as a faculty member who has been "in the trenches," so to speak, managing e-print archives and other related websites. So, here's my story. Back in the autumn of 1995, a friend told me about the World Wide Web, and spent part of an evening teaching me how to code web documents in html. After a couple of weeks of being told by the tech folks at my school that there was no way for mere faculty members to post websites to the university's computer, I finally learned the requisite magic words and was allowed to pass, at least virtually, beyond the gates of mortal man and on to the hallowed domain of the Web Server.

I began setting up websites for several of the various scholarly societies to which I belonged. Most of my requests to society executives to do this were met with something along the lines of, "Uh, What? Yeah, sure. What is it exactly again?" I proceeded as though this had been a "yes," and within a few months had designed and was managing basic websites for five different organizations. This may sound like a lot, but the website designs themselves really did not cost me that much time. What took far more time than I expected -- and continues to do so to this day -- was maintaining them, mostly updating material. Calls for papers go up, only to be replaced by conference programs, then updated conference programs, etc. Membership lists go up, only to be updated on a regular basis as new members join and older ones retire. Executive committees come and go. I decide it might be nice to include on the site a list of representative publications of the society's members, only to have it turn into a struggle between my trying to keep it to a manageable size that will help promote the society's work and a few persistent members trying to force me to include every shopping list they have ever written. Thus, we come to:

Green's Discovery #1 -- Make sure you set aside enough time and personnel to adequately maintain whatever sites you decide to put up. Make sure you take this into account when deciding what extra features to add to your site.

Sometime the following year, I realized that I could bypass those nasty, overpriced, usually unreliable commercial print shops that reside on most college campuses, feeding off professors' needs to get sets of disparate readings to their students, if only I could post those readings to a website to which my students had access. I faced an immediate problem, however: copyright. At this point, it appears that Providence herself stepped in to save a good idea that happened to contravene several sections of the criminal code: I learned that the American Psychological Association, which has long owned most of the best journals in psychology, did not bother to renew copyright back in the days when it was necessary to do so, and so everything contained in their journals up to the early 1960s is now in the public domain. For most professors this might not have been such a boon, the material being so old, but for me, teaching history of psychology and anxious to get my students to read some primary source material rather than just simplified, homogenized textbook accounts, it appeared as the Promised Land before me.

Thus, in December of 1997 was born the website for which I am now probably best known: Classics in the History of Psychology. Initially, I posted to the site a set of five articles of historical importance (to psychologists, anyway). I had thought, but didn't quite wholly believe, that other history of psychology teachers might find such a site useful too, especially people in parts of the world less favored in terms of library facilities than was I, being near the gigantic University of Toronto collection. My doubts about the potential of this project were put to rest the following fall, by which point the site was getting over 10,000 page hits per month. In the five years since then I have posted about 200 articles, chapters, and books to the site, and it now receives about 175,000 page hits per month from countries the world over. So we come to:

Green's Discovery #2: Be prepared for the possibility of there being orders of magnitude more use of your site than you expected. It is not only your members, but many times as many students, as well as people from outside the discipline, and their students, who may be interested in what you have to offer. As economists know well, the so-called law of supply and demand is wrong. Supply, especially when cheap and easy to access, can create a thitherto unknown demand of its own. This has happened in virtually every domain into which the internet has expanded. Scholarship is no different, in this one respect anyway. It's not just about professional academics anymore. It's also about the general public having access to information that was, until now, virtually impossible for them to get their hands on.

Moving closer to the precise topic of this symposium, last September, I set up a site into which colleagues could deposit their own research reports, in electronic form, whether they had been previously published or not, called the "History & Theory of Psychology Eprint Archive," or "HTP Prints" for short. Thus far, the HTP Prints site has attracted only a few dozen documents, but it is already receiving about 7000 page hits per month. I anticipate a total of about 85,000 page hits in this, its first complete year. This will only grow as I convince more of my colleagues of the enormous advantages in "visibility" that electronic publication affords their work. But still:

Green's Discovery #3: You must make a systematic and sustained effort to convince your colleagues to "buy in" to electronic publication. Some think it is going to be hard to upload their documents. Good software and good on-line support can allay those fears. Some think their work is going to be plagiarized. They must be reassured that the risks are really no greater (relative to the corresponding advantages) than with traditional print media. Some are just plain resistant to change. With luck, this won't be as much a problem for you in communications, of all things, as it has been in other disciplines. Experience shows, however, that "hard" scientists usually adopt the new media immediately (indeed, the WWW was invented at the Swiss physics lab CERN). Social scientists are somewhat slower and more resistant. Humanists often act as though we're setting the house on fire.

I was moved to take the step of founding HTP Prints by the success of CogPrints, an e-print archive designed by the redoubtable Stevan Harnad to serve the Cognitive Science community, and by the release of the free software package called "Eprints" that enables one -- well, one who is well-versed in the arcana of unix operating systems -- to simply install (rather than having to program from scratch) a fully functional e-print server and then adjust colors, topic names, and a few other features to taste. This one-size-fits-all approach may seem implausible, but in reality it has worked astonishingly well for topics ranging from philosophy to physics.

One of the major advantages of the Eprints software package, though it is not immediately obvious, is that it automatically generates what is called "metadata" about each document deposited (e.g., author's name, title, abstract, subject classifications, etc.), and it does so in a manner that is compatible with the emerging standard developed by a group called the Open Archives Initiative (OAI, for short). The practical impact of this is that any e-print archive you set up using Eprints software is automatically interoperable with every other e-print archive in the world that also adheres to OAI format. So, people can set up internet services that do nothing but harvest metadata from OAI-compliant e-print archives all over the world and plug the metadata into a search engine. When you, the user, want to find out about what research has been done lately on, say, racist content being broadcast by independent cable television stations, you do not need to scour dozens of e-print archives dedicated to various disciplines and sub-disciplines. Instead, you need only go to one metadata service provider, which has already scoured the metadata of the available e-print archives, enter your search parameters, and you get a hyperlinked list of all the documents in all the e-print archives that suit your needs. Such services are no longer a thing of the future. They are already beginning to pop up (see, e.g., citebase.eprints.org).
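The harvest-then-search arrangement just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two archive names, the records, the authors, and the example.org URLs are all hypothetical, and the "harvesting" here is a simple in-memory merge rather than the actual OAI protocol, which a real service provider would speak over the network.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Record:
    """One document's metadata, tagged with the archive it came from."""
    archive: str
    title: str
    author: str
    subjects: Tuple[str, ...]
    url: str


# Hypothetical archives and records, invented purely for illustration.
ARCHIVES: Dict[str, List[dict]] = {
    "media-studies-archive": [
        {"title": "Racist content on independent cable television",
         "author": "Author A",
         "subjects": ("television", "racism"),
         "url": "https://example.org/media/1"},
    ],
    "sociology-archive": [
        {"title": "Broadcast regulation and hate speech",
         "author": "Author B",
         "subjects": ("television", "law"),
         "url": "https://example.org/soc/7"},
        {"title": "Survey methods in rural communities",
         "author": "Author C",
         "subjects": ("methodology",),
         "url": "https://example.org/soc/8"},
    ],
}


def harvest(archives: Dict[str, List[dict]]) -> List[Record]:
    """Merge the metadata of every archive into one flat index."""
    index = []
    for name, records in archives.items():
        for fields in records:
            index.append(Record(archive=name, **fields))
    return index


def search(index: List[Record], term: str) -> List[Record]:
    """Return records whose title or subjects mention the term."""
    term = term.lower()
    return [
        record for record in index
        if term in record.title.lower()
        or any(term in subject.lower() for subject in record.subjects)
    ]


if __name__ == "__main__":
    for hit in search(harvest(ARCHIVES), "television"):
        print(f"{hit.title} [{hit.archive}] {hit.url}")
```

The point of the sketch is that the user queries one merged index and never needs to know which archive holds a given document; the archive name travels along only as one more piece of metadata.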

This has led to a very recent and somewhat counterintuitive trend promoted by OAI. If one can get one-stop-shopping from a single metadata service provider, then there is no longer a real need to keep documents grouped together in separate archives according to discipline. Why not, instead, have e-print archives set up at individual institutions, regardless of discipline? In other words, instead of having all the physics documents on one e-print archive, all the chemistry documents on a second one, and all the biology documents on a third, why not have all the documents from, say, Harvard on one e-print archive, regardless of whether they pertain to physics, chemistry, or biology, all the Yale documents on a second, and all the Princeton ones on a third? As long as their topics are suitably represented in the metadata, it doesn't matter a whit where they actually reside (and other things being equal, their residing nearer to the author is probably slightly better, lest technical matters pertaining to the server itself need to be discussed face-to-face).

The inclusion of e-print archives, along with traditional journals and books, is, by and large, a great advance over journals and books alone. Papers that contain good research, but occupy an intellectual niche too narrow to justify the expense of traditional journal publication, can now be disseminated worldwide at nominal cost. Documents that do not abide by the usual parameters of journal articles but are useful nevertheless can now be published as well. This applies not only to articles that are too short or too long to act as journal articles, but also documents that are not like journal articles at all, but important nevertheless. For instance, one of the documents on my HTP Prints is an annotated bibliography of every history of psychology textbook published in the English language since the 18th century. No journal would ever publish such a document, but it is an enormously useful resource to those of us in history of psychology.

Still, there is something about e-print servers -- at least as they are now being used -- which has begun to trouble me of late. Apart from a few bells and whistles, they typically amount to wielding an enormously flexible and dynamic medium as though it were little more than a big magazine. Using the internet to simply mimic formats first developed under the constraints of the printing press and the mail system is a bit like using the movie camera to record nothing but images of printed pages. It is very easy to become so conditioned to the conventional formats of scholarly research -- formats that were not developed in the first place so much to suit the research as to suit the technology by which the research was distributed -- that when new technology, even a technological revolution, comes along, we slavishly attempt to reproduce the old formats in the new technology, more or less blind to the new possibilities that the new technology affords.

The variety of creative new ways in which scholarly research can be produced and disseminated in the new electronic media is immense. We will have to work with it for a number of decades before the range of new "standard" formats is developed. Allow me to describe a project I am now working on that, I believe, will rival anything possible in the older media (scholarly or popular) both in terms of potential interest and in terms of scholarly rigor. I do not propose it necessarily as a model to be followed, but rather as an example of the kind of thing that can be done when one begins to think about scholarly research anew in light of the new electronic technology.

I have been conducting some historical research on a controversial hiring in the Department of Philosophy of the University of Toronto in 1889. Rather than writing a traditional journal article about the events surrounding the hiring, I have decided to make a video documentary. This is an enormous amount of work, but given the kind of digital video software that now comes as standard equipment on Macintoshes, it is not nearly as difficult to do as you might expect. The documentary will be pitched at a level suitable for an upper-level undergraduate class in the history of psychology, but it can also serve as an introduction to the topic for graduate students or professional colleagues. The documentary incorporates dozens of photos, drawings, and paintings of the players involved, the city in which they lived, the buildings and offices in which they worked, newspaper headlines about relevant events, etc., few of which could be included in a traditional journal article. It also includes voice-overs of particularly pertinent passages from their writings, both public and private, excerpts from videotaped interviews with other experts on the topic, and a narration that carries on throughout.

Normally a video such as this might be of interest as a pedagogical tool, but not as a serious work of scholarship because the scholarly apparatus of footnotes, references, etc. is not included. No one else could check my sources, reinterpret them, and all that other stuff we make our careers out of. My intention, however, is to make it as scholarly as anything found in a standard academic journal, indeed more so. I am going to accomplish this by embedding the documentary in a web site. In one window of the website the video will play. In another window, whenever I would normally include a citation to support a claim or quotation, the full reference will appear. If one wishes to check the reference immediately (and I've been able to obtain permission to reproduce it) one can simply pause the video, click on the reference, and the full text will appear in a new window. If instead one wishes to simply note it for future reference, with the click of a button one will be able to add it to a customized list of references to be checked later, and continue watching the video. (Of course, there will, in addition, be available a full reference list to be perused at one's leisure.) The big advantage here, in case it went by so fast that you didn't catch it, is that I won't merely have references to obscure historical documents but, in addition, full transcriptions of those sources (and perhaps facsimiles of the originals too) for users to examine in detail rather than having to seek out and travel to a library that happens to have the same sources. In addition, there may be several secondary documents (or perhaps subordinate audio or video recordings) pertaining to side issues that normally cannot be dealt with in much detail in a journal article or in a book -- e.g., What was this particular person's life and career trajectory like, apart from their involvement in the events focused upon here? How did the issue of, say, religious politics play out more broadly in this community, apart from their impact on this particular set of events? These supplementary works might be authored by me as well. Possibly, however, I will be able to enlist colleagues of mine to write essays or make recordings on particular areas in which they are more expert than I.
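For the technically inclined, the reference window described above might be modeled roughly as follows. This is a minimal Python sketch and not the site's actual design: the class and function names, the timings, the placeholder references, and the example.org URLs are all assumptions made for illustration. Each citation is keyed to the moment in the video at which it should appear, and a viewer can set one aside on a personal list to check later.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Citation:
    start: float                         # seconds into the video
    reference: str                       # the full bibliographic reference
    full_text_url: Optional[str] = None  # None if no permission to reproduce


# Placeholder citations, sorted by start time (required by citation_at).
CITATIONS = [
    Citation(12.0, "Placeholder source A", "https://example.org/sources/a"),
    Citation(95.5, "Placeholder source B"),
    Citation(240.0, "Placeholder source C", "https://example.org/sources/c"),
]


def citation_at(citations: List[Citation], seconds: float) -> Optional[Citation]:
    """Return the most recent citation at or before the playback time."""
    starts = [c.start for c in citations]
    position = bisect_right(starts, seconds)
    return citations[position - 1] if position else None


def save_for_later(reading_list: List[Citation],
                   citation: Optional[Citation]) -> List[Citation]:
    """Add a citation to the viewer's list, skipping duplicates."""
    if citation is not None and citation not in reading_list:
        reading_list.append(citation)
    return reading_list


if __name__ == "__main__":
    my_list: List[Citation] = []
    current = citation_at(CITATIONS, 100.0)
    save_for_later(my_list, current)
    for item in my_list:
        link = item.full_text_url or "full text not available"
        print(f"{item.reference} ({link})")
```

Keeping the citations in a plain time-sorted list means the same data can drive both the reference window during playback and the full reference list offered for perusal afterwards.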

The result is a kind of "scholarly complex" that blurs and transcends many of the academic distinctions we have made in the past because of constraints on the media that were available to us. One person -- say an undergraduate -- might simply watch the documentary and take away my particular analysis and interpretation. Or someone -- say a graduate student -- might "push" down into the more detailed secondary and the published primary documents available on the site. Or, if someone -- say, a professional scholar -- wanted to do further research in this area, he or she might "dig" down even further into the previously unpublished primary documents that are included on the site. Naturally these documents need not all be textual; there could be an array of primary photographic, video, and auditory materials available as well.

So you see, the work transcends the textbook/monograph distinction. Its scholarly "level" can be customized to the level of the user -- both his or her education level and interest level. It is neither purely a journal article, nor a monograph, nor a collection of chapters, nor a collection of reprints. It has aspects of all of these but is still "other." It is neither textual nor graphical -- it is both of these and they are integrated in ways that transcend either alone. In addition, it is organic in that I can continue to add to it and modify it as I learn more or change my mind. It needn't ever be in a fixed and final version. I could even have users make contributions.

The whole business could then be posted to a website, its metadata harvested by a metadata service provider, and anyone interested in these events -- whether psychologist, philosopher, historian, or whathaveyou -- could access the site and use it to whatever degree they deem appropriate for their needs. If someone wanted to download a high quality version of the video and burn it on to their own DVD so it can be shown to a class, it could be set up to allow that as well. One, of course, has to be somewhat careful that anything that can be downloaded has the necessary copyright notices, but the chances of "stealing" the material here are really no greater than they are for videos that are available for rent. Of course, rented videos are pirated all the time, but the difference here -- as Stevan Harnad has to point out to his critics about once per hour -- is that academics for the most part work for the credit, not for the money, their research brings them. If people want to copy my video, credits intact, to show to their class, I encourage them to do so. Scholarly knowledge is to be spread around. So,

Green's Discovery #4: Don't slavishly replicate the formats with which you are already familiar in a new medium that has infinitely greater possibilities. We cannot start entirely anew, of course. We live in our own histories. But we can still make an effort to break out of old constraints that no longer apply and to discover new ways to make our research broader, deeper, and more accessible. The internet brings with it all kinds of possibilities we haven't even thought of yet.

So, I hope I haven't bored you too much with my own personal history and my fantasies for the future. This is what I have discovered as I have scouted through parts of this new "Publicatia Incognita," as I have called it in the title. I have no doubt that those with more money and courage than I will delve into it more deeply and make a more systematic survey of it as time goes on. It is still early days, however, and I hope a few "independents" like me can still make a contribution.


Addendum

During the discussion after the symposium in which the present paper was included, the President of NCA announced that an agreement had been reached with a major commercial publishing company to produce the various NCA journals. When asked about whether an e-print server was part of the negotiation, he said that he hoped it would come in time, but that the topic had not been a major part of the negotiations. I suggested that time was a luxury that scholarly associations and commercial publishers no longer had because it had become so easy to found an e-print archive that any member of the audience could do it in the next few days. Having the endorsement of NCA would be desirable, but by no means necessary because an editorial board of established figures could be assembled in short order as well. I then put forward the possibility of print journals becoming a kind of "junior circuit" in future -- used primarily by graduate students and junior faculty to establish their credibility in the eyes of behind-the-times hiring and tenure committees -- while the more creative among the tenured faculty turned away from the limitations of traditional journals toward not just e-print archives, but the whole range of possibilities the internet affords for new forms of scholarly research (discussed briefly above).