Our starting point for this research paper was a frequency analysis of the key words attached to articles published from 1996 to 2006 in the Journal of the Experimental Analysis of Behavior, a standard source of literature on operant conditioning. The topics we take up here are those our analysis revealed as the most clearly delineated, though we do not present them in order of relative frequency. Taken together, we consider them a reasonable roadmap for navigating the present landscape of theory and research in operant conditioning. We will turn to the topics after a brief look at the basic elements of the foundational theory of operant conditioning and general methods for its study.
Basic Theory And Methods
The word theory rarely appeared among the key words. This absence may reflect a lingering ambivalence about the role of theory in operant conditioning, an ambivalence traceable to the work of the individual whose name appeared most frequently in our analysis—B. F. Skinner. His illustrious career spanned more than 60 years and saw him in several distinct roles as researcher, innovator, author, teacher, philosopher, social commentator, and, for some, visionary. He coined the term operant conditioning. No name is linked more canonically to operant conditioning than his. His first book, The Behavior of Organisms, was published in 1938 and remains unrivaled as a source of theory about operant conditioning (see the special issue of Journal of the Experimental Analysis of Behavior published in 1988 [Vol. 50, No. 2] that marked the 50th anniversary of The Behavior of Organisms). The book reports his ingenious research efforts to disentangle the operant form of behavior from the reflexive forms that others had previously identified and that constituted psychology’s preoccupation at the time. The research was conducted with rats. Later he introduced pigeons and eventually humans as research subjects. Indeed, these three categories of research subject were the most frequent entries in our analysis of key words, pigeons and rats appearing much more frequently than humans. The other most frequently appearing terms were the response forms that Skinner introduced for his animal subjects—key pecks and lever presses, respectively.
Though The Behavior of Organisms may be viewed as laying out a theory of operant conditioning, Skinner’s (1953) other most influential book, Science and Human Behavior, is skeptical about the role of theory in the science of behavior (see the special issue of the Journal of the Experimental Analysis of Behavior published in 2003 [Vol. 80, No. 3] that marked the 50th anniversary of Science and Human Behavior). Or at least he was skeptical about a certain type of theory—namely, one that uses one set of terms to postulate about the causes of behavior and a different set of terms to describe the actual behavior of interest. Skinner was particularly harsh in rejecting mentalistic theories of behavior—theories that referred to mind and thought, to will and agency, and so on. He considered theories with these fictional entities at their heart frivolous or, worse, unscientific impediments to a real science of behavior, a position he championed to the end of his life (see Skinner, 1990).
To make his own theoretical position clear, Skinner articulated the well-known three-term contingency of reinforcement. In addition to behavior (measured as discrete responses belonging to an operant class and often expressed in terms of the rate of responding—that is, the number of responses over an interval of time), he included the antecedents of responses (discriminative stimuli, the discrete stimuli that “set the occasion” for responses [in his phrase]) and the consequences of those responses (specifically, the stimuli that responses produce or that otherwise follow responses), which he termed reinforcers (and which are also often measured in terms of rate). For Skinner, stimuli acquire their discriminative function through conjunction with the relations between responses and reinforcers, as when reinforcers increase the rate of responding, and the absence of reinforcers decreases it. The science of behavior will proceed best by analyzing behavior at the level of discriminative stimuli, responses, and reinforcers and in terms of the contingent and noncontingent relations between them. In other words, Skinner advocated a functional analysis of behavior, an analysis of how it functions amid discriminative stimuli and reinforcers. His invention of schedules of reinforcement constituted an explicit means of analysis—methods for instantiating stimuli, responses, and consequences and for understanding their relations. A schedule is a collection of procedural steps whereby discriminative stimuli, responses, and reinforcers are presented in the course of an experimental session (see Ferster & Skinner, 1957, for an encyclopedic look at schedules and their effects on pigeons, rats, and humans).
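Because schedules are procedural, they lend themselves to expression as short programs. The sketch below is our illustration rather than anything from the sources above; it implements two of the simplest arrangements: a fixed-ratio schedule, in which every nth response is reinforced, and a fixed-interval schedule, in which the first response after a fixed time has elapsed is reinforced.

```python
import random

def simulate_fixed_ratio(n_responses=200, ratio=5):
    """Reinforcers earned under a fixed-ratio (FR) schedule: every
    `ratio`th response produces a reinforcer."""
    reinforcers, count = 0, 0
    for _ in range(n_responses):
        count += 1
        if count == ratio:      # response requirement met
            reinforcers += 1
            count = 0           # reset the ratio counter
    return reinforcers

def simulate_fixed_interval(session_s=600.0, interval_s=30.0, resp_rate=1.0):
    """Reinforcers earned under a fixed-interval (FI) schedule: the first
    response emitted after `interval_s` seconds is reinforced. Responses
    are approximated as a Poisson process at `resp_rate` per second."""
    t, last_reinforcer, reinforcers = 0.0, 0.0, 0
    while True:
        t += random.expovariate(resp_rate)      # time of the next response
        if t >= session_s:
            break
        if t - last_reinforcer >= interval_s:   # the interval has elapsed
            reinforcers += 1
            last_reinforcer = t
    return reinforcers

print(simulate_fixed_ratio())      # FR 5: 200 responses yield 40 reinforcers
print(simulate_fixed_interval())   # FI 30 s: roughly 20 reinforcers in 600 s
```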
Skinner’s commanding importance to operant conditioning is reflected in the regularity with which The Behavior of Organisms and Science and Human Behavior continue to be cited as the jumping-off place for research and theory. As a recent example, Catania (2005b; see also Killeen, 1988) returned to Skinner’s (1938) original concept of the reflex reserve, which Skinner later renamed the operant reserve but eventually abandoned. According to Skinner, the operant reserve was a hypothetical store of responses that operated hydraulically. It rose when responses were reinforced and fell when they were emitted but not reinforced. The concept ran into trouble when faced with the so-called partial reinforcement effect, the demonstration that responses reinforced only intermittently (exemplified by schedules of reinforcement in which only some, not all, responses were followed by reinforcers) were more resistant to extinction than responses that had been reinforced continuously. According to Catania, it is possible to save the concept of a reserve by extending the influence of a specific reinforcer not just to the reinforced response but also to the other, nonreinforced responses that preceded it. In this way, each response adds to the reserve, though not to the same degree, as well as depletes it. At issue is the extent to which nonreinforced responses increase the reserve. Catania considered various ways of expressing that contribution as a function of the time that elapsed between a response and the next occurrence of a reinforcer—the delay-of-reinforcement gradient. He also applied this view to several different types of reinforcement schedule, using computer simulations to suggest the type of performance the reserve would generate under each type.
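Catania’s proposal lends itself to a small simulation. The following sketch is ours, not Catania’s (2005b) code, and the exponential gradient and parameter values are assumptions; it shows the bookkeeping: each emitted response depletes the reserve slightly, and at reinforcement every response in the episode replenishes the reserve in proportion to the gradient evaluated at its delay from the reinforcer.

```python
import math

def delay_gradient(delay_s, tau=10.0):
    """Assumed exponential delay-of-reinforcement gradient: a response's
    credit falls off with the time separating it from the reinforcer."""
    return math.exp(-delay_s / tau)

def update_reserve(reserve, response_times, reinforcer_time,
                   increment=1.0, decrement=0.05):
    """One reinforcement episode in a reserve model: emission depletes the
    reserve; reinforcement credits every preceding response according to
    the gradient at its delay, the reinforced response most of all."""
    reserve -= decrement * len(response_times)          # emission depletes
    for t in response_times:
        reserve += increment * delay_gradient(reinforcer_time - t)
    return reserve

# Five responses 2 s apart; the last one (delay 0) is reinforced at t = 10 s.
print(round(update_reserve(1.0, [2, 4, 6, 8, 10], 10.0), 2))
```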
Catania’s (2005b) article also exemplifies the growing use of mathematics to do more than merely summarize research results. He used mathematical functions to describe hypothetical processes that are responsible for the relations between responses and reinforcers. Mazur (2006) recently documented the increased use of mathematical models with a frequency analysis of the contents of Journal of the Experimental Analysis of Behavior. From 1960 through 2000, the percentage of articles that included mathematical equations increased approximately tenfold, from 3 to 30 percent. Reflecting on this growth, Mazur cited several benefits of the reliance on mathematical models in behavior analysis. Specifically, they lend improved precision over purely verbal descriptions of data and processes, thereby sharpening the differences in predictions between competing accounts of the behavior in question. In doing so they may call attention to factors that were overlooked and to the insufficiency of competing accounts, thereby stimulating their improvement. Mathematical models may also provide an integrative role, drawing together data from disparate sources in the interest of identifying a core set of theoretical concepts by which to examine and account for a growing diversity of behaviors. In addition, Mazur pointed to use of mathematical models for improved communication between researchers and between researchers and practitioners, including behavior therapists.
To illustrate the role of mathematical models in deciding between competing theoretical views, Mazur (2006) focused on the use of conditioned reinforcers in higher order schedules of reinforcement known as concurrent chains. Conditioned reinforcers may result from pairing originally neutral stimuli with biologically significant reinforcers such as food or water and are widely available in human experience. In concurrent-chains schedules, conditioned reinforcers may be colored lights that follow responses to the plastic discs that pigeons peck in order to produce food. Mazur presented three different models of behavior in concurrent-chains schedules. For each, he stated the prediction the model makes as a function of the delay to reinforcement (the same variable that Catania used in the operant reserve study)—in this case, the delay between the appearance of the conditioned reinforcer and the delivery of food. Future experiments designed to test the predictions may reveal the relative superiority of one model over the others and lead to subsequent revision of the latter or their abandonment in favor of the former.
Mazur’s reference to mathematical models that integrate diverse phenomena is exemplified by Killeen’s Mathematical Principles of Reinforcement (MPR; Killeen, 1994; Killeen & Sitomer, 2003), a theory of reinforcement schedules that combines principles of motivation, constraint, and association, each specified mathematically. It begins simply enough with reference to the behavioral arousal that reinforcers elicit and to the relation among arousal, rate of reinforcement, and rate of response. It then considers the way in which the minimal amount of time required to emit a response may constrain the rates of response and reinforcement when schedules of reinforcement are imposed. Finally, it makes room for a now-familiar variable: the delay between a response and a reinforcer (i.e., the extent to which they may be associated). In mathematical terms, MPR is expressed as follows:
b = cr/(δr + 1/a)

where b stands for response rate and r for reinforcement rate, a is a measure of arousal level, δ represents the average amount of time required to emit a response, and c is a measure of the association between response and reinforcement specific to the reinforcement schedule that is in place. Mazur cited the extension of MPR to a wide range of behavioral variables, including the effects of brain lesions and drugs.
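As a check on how the pieces interact, the expression can be evaluated numerically. The following sketch is ours, with illustrative rather than fitted parameter values; it shows the characteristic MPR prediction that response rate rises with reinforcement rate at low rates and approaches the ceiling c/δ set by response duration.

```python
def mpr_response_rate(r, a=0.03, delta=0.4, c=0.8):
    """MPR prediction b = c*r / (delta*r + 1/a).

    r: reinforcement rate; a: arousal parameter; delta: seconds per
    response; c: schedule-specific coupling. Values are illustrative."""
    return c * r / (delta * r + 1.0 / a)

# Response rate grows with reinforcement rate but saturates near c/delta = 2.0.
for r in (1, 5, 20, 100):
    print(f"r = {r:3d}: predicted b = {mpr_response_rate(r):.2f}")
```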
In addition, Mazur (2006) cited the relevance of MPR to what has become the overarching issue in operant conditioning, namely, the level at which behavior analysis properly takes place. In The Behavior of Organisms, Skinner (1938) gave meticulous attention to specific properties of the rat’s response itself, including its form (topography), duration, and force, only to conclude eventually that rate of responding was the primary datum of operant conditioning. This distinction established the counterpoise between studying the effects of reinforcement at the momentary level at which responses occur (sometimes referred to as the local or molecular level) and its effects on responding extended over time (i.e., at the overall, global, or molar level). The ability of MPR to account for behavior at both levels is, for Mazur, one of its virtues. Another is the ability to account for behavior when it is first acquired as well as its occurrence following acquisition (i.e., in the steady state). Taken together, these virtues of MPR—predictive precision, integration of diverse phenomena, and extensibility to molecular and molar analyses and also to acquisition and steady state—may be considered the gold standard for theories of operant conditioning.
In the remainder of the chapter, we turn to several topics that represent the directions in which the study of operant conditioning has trended in recent years. In almost all cases, the topics are (a) instantiated in research that involves reinforcement schedule methodology and (b) expressible as mathematical models that can be anchored in one way or another to Skinner’s founding theory of the three-term contingency.
A Sampler Of Trends
What Is the Role of Reinforcers?
We previously made reference to the ubiquity of conditioned reinforcers in daily life and to the experimental study of conditioned reinforcement using concurrent chains schedules. A simple question about conditioned reinforcers is whether they have effects similar to those of the primary reinforcers they are associated with—a similarity implied by the joint use of the term reinforcer. For example, do the advertisements (one form of conditioned reinforcer) that appear in popular media affect our preferences in the same way as the actual products they portray affect them (i.e., by strengthening or weakening preference)? Or, alternatively, do conditioned reinforcers merely serve as placeholders for the primary reinforcers whose availability they signal? In other words, do reinforcers have a strengthening or a discriminative function?
Davison and Baum (2006) had previously studied “preference pulses” in pigeons using concurrent schedules of reinforcement in which they recorded every peck and every reinforcer. A preference pulse was identified by the persistence of responding to the same schedule following reinforcer delivery (i.e., preferring to remain under that schedule rather than switching to the other schedule that was available). A preference pulse grew larger as the number of responses under the same schedule increased before the subject switched to the other schedule. Davison and Baum used preference pulses to answer a simple question: do conditioned reinforcers produce preference pulses like those that food produces? In their experiment, the conditioned reinforcer was the presentation of a light that illuminated the location where food was delivered to the pigeons. Sometimes the light and the food occurred together. At other times, the light was presented by itself. Preference pulses reliably followed the light alone but were not as large as those produced by food-plus-light presentations, a result consistent with the view that conditioned reinforcers produce reinforcing effects, though not as strongly as primary reinforcers do.
In subsequent conditions, Davison and Baum (2006) pitted food against conditioned reinforcers by altering the degree to which the food ratio between the two schedules was the same as the ratio of conditioned reinforcers. In essence, this procedure allowed them to ask further whether the function of conditioned reinforcers is the same as that of primary reinforcers (i.e., they strengthen the responses that produce them) or whether their function instead is to signal the availability of primary reinforcers (i.e., they have a discriminative function). The results supported the latter view, thus calling into question the longstanding assumption that conditioned reinforcers function as reinforcers per se. Referring to the earlier example, advertisements may not affect our preferences for products so much as provide us with signals of the relative availability of those products: Their function may be that of a discriminative stimulus rather than a consequence. Davison and Baum ended their article with a potentially more provocative question: Do primary reinforcers also exert discriminative effects rather than reinforcing effects? The relevance of this question for the traditional view of the three-term contingency should be clear.
Behavioral Momentum And Conditioned Reinforcers
The preference pulses identified by Davison and Baum (2006) are related to the persistence of behavior following a reinforcer. A similar interest in the persistence of behavior (i.e., resistance to behavioral change) is found in the theory of behavioral momentum (Nevin, 1992; Nevin & Grace, 2000). According to the theory, reinforcers affect both response rate and resistance to change. Their effect on response rate reflects a response-reinforcer relation (the familiar strengthening relation). In contrast, their effect on resistance to change reflects a stimulus-reinforcer relation (i.e., a Pavlovian relation). Specifically, resistance is affected by the total rate of reinforcement that occurs in the presence of a stimulus. By analogy to physical momentum, behavioral mass determines the extent to which response rate (the analog of velocity) changes when a disruptive force is applied. The logic of experimental demonstrations of behavioral momentum depends on the introduction of “disruptors,” such as extinction, following the acquisition of stable performance under a standard reinforcement schedule.
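A minimal sketch of the momentum metaphor follows, under assumed functional forms (Nevin and colleagues typically work with logarithms of proportion-of-baseline response rate and treat behavioral mass as a power function of reinforcement rate; the exponential form and parameter values here are our illustration):

```python
import math

def proportion_of_baseline(disruption, reinforcement_rate, b=0.5):
    """Momentum-style prediction: response rate under disruption as a
    proportion of baseline. Behavioral mass is taken as reinforcement
    rate raised to a power b; both forms are illustrative assumptions."""
    mass = reinforcement_rate ** b
    return math.exp(-disruption / mass)

# The same disruptor produces less change in the richer stimulus context:
for rate in (1, 4, 16):   # reinforcers per minute in a given stimulus
    print(f"rate {rate:2d}: {proportion_of_baseline(1.0, rate):.2f} of baseline")
```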
A recent experiment by Shahan and Podlesnik (2005) asked whether behavioral momentum also applies to conditioned reinforcers. Pigeons emitted observing responses that produced the stimuli (conditioned reinforcers) associated with the various reinforcement schedules that could be in effect at any one time. Shahan and Podlesnik arranged alternating conditions in which observing responses produced a high rate of conditioned reinforcers or a low rate. They found that observing response rate was higher in the former condition (rich) than in the latter (lean). However, when extinction was used as a disruptor, resistance to change was essentially identical for both conditions. Among the possible explanations the authors offered was that “higher rates of stimulus [conditioned reinforcer] production may not have generated greater resistance to change because the stimuli have their effects on response rates through some mechanism other than strengthening by reinforcement” (pp. 15-16). This possibility aligns nicely with Davison and Baum’s (2006) conclusion.
Further Questions About The Three-Term Contingency: The Misbehavior Of Organisms And Superstition
Davison and Baum’s (2006) and Shahan and Podlesnik’s (2005) work raises questions about the adequacy of Skinner’s three-term contingency for the analysis of operant behavior. In a probing examination of its adequacy, Timberlake (2004) reflected on Skinner’s consummate skill as a shaper of new behavior in animals and the related work of two of Skinner’s students, Keller and Marian Breland (later Breland Bailey). With reference to Skinner’s successive (and successful) innovations for measuring rats’ behavior as outlined in The Behavior of Organisms, as well as subsequent clever demonstrations of shaping (rats lowering ball bearings down chimneys, pigeons bowling or playing ping pong), Timberlake pointed up Skinner’s ready eye for the present and potential behaviors in an animal’s repertoire—what Timberlake styled the “proto-elements” of contingency-shaped behavior. Equally impressive was Skinner’s eye for the physical modifications of apparatus that were necessary to produce the desired behavior—“tuning the apparatus.” His finesse was underscored by the Brelands’ subsequent discovery of instances of misbehavior that emerged unbidden when reinforcement contingencies were applied—behavior gone awry despite the apparently orthodox implementation of shaping techniques. For example, one project involved training pigs to deposit wooden coins in a piggy bank (K. Breland & M. Breland, 1961). When the trainers increased the response requirement, the pigs paused midroute and began to root the coins instead of carrying them to the bank. Such misbehavior persisted despite the loss of the reinforcers that awaited successful deposit.
For Timberlake (2004), successful implementations of the three-term contingency may mask the judicious conjunction of the elements that are operative, including the selection of the species whose behavior will be studied, the selection of the target behavior, the selection of the reinforcer and discriminative stimuli, and the selection of the contingency itself. Because each of these factors is related to each of the others in possibly complex ways, simply aligning arbitrary discriminative stimuli with arbitrary responses and reinforcers in no way assures the occurrence of operant conditioning. Each aspect of the three-term contingency and the relations between them are fitting candidates for further research.
Much the same point was made by Williams (2001), who revisited Skinner’s (1948) classic demonstration of so-called superstitious behavior in pigeons. Food was presented to pigeons at regular intervals but was not contingent on behavior. Despite the absence of a contingency, Skinner reported that each pigeon soon exhibited its own ritualized behavior and concluded that these emergent behaviors testified to the selective action of the reinforcer. Williams also referred to the classic extension of Skinner’s experiment by Staddon and Simmelhag (1971), who found that the behaviors Skinner observed were actually components of a natural repertoire elicited by food. What appeared to have been operant conditioning on the surface turned out to be a case of Pavlovian stimulus-reinforcer relations.
Williams (2001) concluded, as did Timberlake (2004), that, just as each of the separable relations between contingent stimuli and responses in Pavlovian conditioning makes its own contribution to conditioning, so the separable relations between the elements of the three-term contingency must be evaluated for their differential contributions to operant conditioning.
Behavioral Economics: Temporal Discounting And Self-Control
Even though the requisite responses occur under a schedule of reinforcement, delivery of the contingent reinforcers may be delayed (see Catania’s, 2005b, reference to the delay-of-reinforcement gradient). What a person prefers right now—say, ice cream—may no longer be preferred if the person has to wait a week for the ice cream, or a day, or even an hour. Many people, especially children, know the frustration of ordering something they want from an online catalog, only to have to wait for it to arrive in the mail. Being able to obtain it faster somewhere else, even if it is more expensive there, may be preferred to having to wait for it.
The study of choice, including choice between immediately available and delayed reinforcers, eventually prompted an intersection between operant conditioning research and experimental economics known as behavioral economics. The shared concepts and terminology include cost-benefit analysis, maximization of utility, marginal utility, budget, and discounting (present vs. future value), among others. Of particular interest has been the shape of the standard discounting function, the curve relating the present value of a reinforcer to its delay. Traditionally, economics has viewed the curve as falling off at a constant proportional rate, a form referred to as exponential discounting in reference to the mathematical form of the discount function. However, operant research with humans and animals has shown that a different kind of function—the hyperbolic function—better describes the results of experiments in which individuals make repeated choices between options available now and those available later. The hyperbolic function falls more steeply than its exponential counterpart at short delays but more shallowly at long delays. For that reason, hyperbolic functions more readily lend themselves to the reality of preference reversals, as occur, for example, when an individual sets the alarm clock earlier than usual, fully planning to take advantage of the additional time that earlier rising will afford. The next morning, however, the folly of the previous night’s plan is all too obvious when the alarm clock buzzes; a few more minutes of sleep has a newfound deliciousness, it seems.
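The contrast is easy to see numerically. In the sketch below (our illustration; the discount-rate parameter k and the reward amounts are arbitrary), the hyperbolic form V = A/(1 + kD) produces a preference reversal as a pair of rewards draws near, whereas the exponential form V = A·exp(-kD) preserves the same preference at every delay:

```python
import math

def exponential_value(amount, delay, k=0.2):
    """Exponential discounting: V = A * exp(-k * D)."""
    return amount * math.exp(-k * delay)

def hyperbolic_value(amount, delay, k=0.2):
    """Hyperbolic discounting: V = A / (1 + k * D)."""
    return amount / (1 + k * delay)

# Smaller-sooner reward: 10 units at delay d; larger-later: 20 units at d + 10.
for d in (0, 2, 10, 40):
    ss = hyperbolic_value(10, d)
    ll = hyperbolic_value(20, d + 10)
    print(f"delay {d:2d}: SS = {ss:5.2f}, LL = {ll:5.2f} -> prefer "
          f"{'smaller-sooner' if ss > ll else 'larger-later'}")

# Under exponential discounting the LL:SS value ratio is the constant
# 2 * exp(-10k) at every delay, so no reversal can occur:
print(round(exponential_value(20, 10) / exponential_value(10, 0), 2))
```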
Green and Myerson (2004) summarized the experimental support for hyperbolic discounting in the case of choices involving delayed reinforcers as well as that in which the choice is between reinforcers with different odds or probabilities. For example, the discounting functions were steeper for children and young adults than for older adults. Among older adults, they were steeper for those with lower incomes than for those with higher incomes, with education level held constant. Steeper discounting functions also appeared for persons addicted to heroin and alcohol and for smokers compared with nonsmokers, and steeper discounting occurred in pathological gamblers than in occasional gamblers or nongamblers. Although affirming the primacy of the hyperboloid function for describing temporal discounting, Green and Myerson conceded the real possibility that the processes of discounting for delayed as opposed to probabilistic outcomes may be fundamentally different—two processes rather than a single process—and pointed to the need for further research to resolve the matter.
Rachlin (2000) applied the concept of hyperbolic discounting to delayed outcomes and probabilistic outcomes using a theoretical framework he termed teleological behaviorism. On this view, the proper level of analysis is not individual responses or rates of response but patterns of responses extended in time. Moreover, reinforcer delay (measured as units of time) and reinforcer probability (measured as odds) have comparable effects on choice—reinforcers that come sooner are preferred to those that come later, just as those that are more likely are preferred to those that are less likely. These preferences may obtain even when the later arriving or less likely reinforcers are more sizeable than their sooner arriving or more likely counterparts. By behaving in ways that prevent access to the sooner available or more likely alternative—that is, by engaging in what Rachlin called commitment—an individual foregoes the reinforcers that would otherwise prevail in favor of those that are richer. This achievement is self-control. It may occur in isolated instances (getting up an hour earlier than usual even on an especially cold morning), or it may become a pattern of behavior that occurs over longer stretches of time, eventually characterizing an individual lifestyle—for example, a prudent life as opposed to a profligate one. Rachlin (2002) also has shown how teleological behaviorism can resolve the inherent contradictions of altruistic behavior.
Behavioral Ecology In The Operant Laboratory
Although methods of operant conditioning may be applied in naturalistic settings (see Baum, 1974a, for ideas on studying pigeons’ behavior in one’s broken-windowed attic), their typical venue is the laboratory. However, even there it is possible to examine behavioral phenomena that ecologists have previously identified in the wild, including the behavior of groups of subjects. Baum and Kraft (1998) approximated natural patches of food by feeding a 30-member flock of pigeons in ways that constrained both the size of the areas in which food was available and the delivery of food to those areas. They varied the travel distance between areas and also the overall rate of food delivery. In a lengthy experiment, they compared the individual allocation of behavior between areas by members of the flock to the aggregated allocation of the flock as a whole.
The analysis of operant behavioral allocation (i.e., of choice between alternative sources of reinforcement) had led to the formulation of the generalized matching law (Baum, 1974b; Davison & McCarthy, 1987). It states that, when two alternatives are available, the proportion of responses directed to one alternative equals the proportion of reinforcers provided by that alternative, subject to a pair of factors known as bias and sensitivity. Bias refers to the fact that the experimental subject may prefer one reinforcer consistently to another regardless of differences in frequency, amount, or other aspects. Sensitivity refers to the extent to which the subject discriminates the differences between reinforcers. Baum and Kraft (1998) asserted the mathematical correspondence between the generalized matching law and the ideal free distribution, a standard model in behavioral ecology for describing group foraging. In essence, it describes the equilibrium that results when individual foragers switch between food patches in order to optimize individual net gain.
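In its familiar ratio form the generalized matching law is written B1/B2 = b(R1/R2)^s, with b for bias and s for sensitivity. A small sketch (our illustration, with arbitrary parameter values) shows how undermatching (s < 1) pulls predicted choice toward indifference:

```python
def behavior_ratio(r1, r2, bias=1.0, sensitivity=0.9):
    """Generalized matching law: B1/B2 = bias * (R1/R2) ** sensitivity.
    sensitivity < 1 is undermatching; bias != 1 is a constant preference."""
    return bias * (r1 / r2) ** sensitivity

# Reinforcer ratios of 1:1, 3:1, and 9:1 between the two alternatives:
for r1, r2 in ((1, 1), (3, 1), (9, 1)):
    print(f"R1:R2 = {r1}:{r2} -> predicted B1/B2 = {behavior_ratio(r1, r2):.2f}")
```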
Across the several conditions of their experiment, Baum and Kraft (1998) found that the members of the flock participated to different but consistent extents, perhaps reflecting individual differences in competitive ability. However, there was little if any consistency between individual subjects’ responses to shifts in food distribution between the areas, travel time between areas, or the total amount of food available. Instead, what emerged was a collective adherence to the ideal free distribution; it described well the behavior of the flock as an entity.
In a subsequent study, Kraft and Baum (2001) applied the ideal free distribution to the behavior of human subjects in a study analogous to that with pigeons. Instead of moving between food patches, each subject in a 13-person group displayed a colored card (blue or red) to which different numbers of points were assigned and received points based on the number of subjects who displayed the same color. For example, in the 80 to 40 condition, those displaying blue cards divided 80 points and those displaying red cards 40 points. In the 20 to 100 condition, those displaying blue cards divided 20 points and those displaying red cards 100 points. Across the successive trials in each condition, the distribution of cards displayed within the group tended to match the points ratio, thus achieving an equilibrium in which gains were approximately equal for each subject and in accordance with the ideal free distribution.
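The ideal free distribution makes the quantitative prediction concrete: the group should split in proportion to the resources, so that per-capita gains equalize. A quick check for the two card-game conditions (group size 13, point values from the study; the code is our illustration):

```python
def ifd_split(group_size, points_a, points_b):
    """Ideal free distribution: subjects divide so that per-capita payoff
    equalizes across alternatives, i.e., n_a / n_b = points_a / points_b."""
    n_a = group_size * points_a / (points_a + points_b)
    return n_a, group_size - n_a

for points in ((80, 40), (20, 100)):
    blue, red = ifd_split(13, *points)
    print(f"{points[0]} to {points[1]} condition: ~{blue:.1f} blue, "
          f"~{red:.1f} red (per-capita {points[0]/blue:.1f} vs {points[1]/red:.1f})")
```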
Animal Timing and Cognition
The experimental efforts to demonstrate similitude between the operant behavior of animals foraging in natural habitats and that of humans playing a card game in the laboratory reflect an interest in complex behaviors shared by humans and nonhumans alike, which include covert or private behaviors—behaviors observable only by the individuals to whom they belong. Timing behavior is one of them. Of course, humans have access to a variety of timekeeping devices but are still able to estimate time in their absence. The experimental analysis of animal timing was part of Skinner’s oeuvre. A recent review (Lejeune, Richelle, & Wearden, 2006) of operant methodologies and models of animal timing recognized Skinner’s role and his scrupulous avoidance of any reference to internal clocks or other mentalistic time-telling. Of particular moment were Skinner’s invention of the fixed-interval (FI) schedule of reinforcement and schedules for the differential reinforcement of response rate, which allowed Skinner and others to study temporal discriminations and additional phenomena related to temporal control (i.e., the control of behavior by intervals of time). More recent behavior-analytic accounts of animal timing have emphasized the role of behavior in mediating temporal discriminations. Lejeune et al. paid close attention to three models in particular: Killeen and Fetterman’s (1988) Behavioral Theory of Timing (BeT), Machado’s (1997) Learning to Time model (LeT), and Dragoi, Staddon, Palmer, and Buhusi’s (2003) Adaptive Timer Model (ATM). The latter model distinguishes between reinforced behavior and all other behavior and assumes that the temporal control of behavior is reflected in the patterns of alternation between the two types of behavior over time. Alternations occur less frequently as rate of reinforcement declines—a process represented by a decay parameter in the ATM. Under FI schedules, for example, the relatively long pause following a reinforcer is attributed to sustained patterns of other behaviors that give way to the resumption of reinforced behavior shortly before the next reinforcer.
A similar insistence on accounting for complex behavior in terms of behaviorally defined mechanisms characterizes the study of animal cognition (i.e., animals’ private or covert behaviors) within the operant conditioning literature: Any mechanism that is invoked should be readily susceptible to experimental demonstration. Work summarized by Zentall (2001) is illustrative of this process for assuring the parsimony of behavioral theories. Zentall and his students have used the procedure of delayed matching-to-sample to study pigeons’ “representation” (a cognitive term of reference) of visual stimuli. In the basic procedure, an arbitrary stimulus (the sample) is presented for a fixed duration. Later, two comparison stimuli are presented, one of which is the original stimulus. The pigeon’s task is to select (match) that stimulus from the pair. By manipulating the delay between presentation of the sample and the comparison stimuli, it is possible to produce a retention function, or a function that relates the probability of a correct match to the length of the delay.
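A retention function of this kind is often summarized with a simple decay curve. The sketch below is our illustration, not Zentall’s fitted model; the exponential form and parameter values are assumptions. It relates matching accuracy to delay, falling from near-perfect matching toward the 50 percent chance level that two comparison stimuli impose:

```python
import math

def retention_function(delay_s, decay=0.15, chance=0.5):
    """Probability of a correct match after a sample-to-comparison delay:
    exponential decay from perfect accuracy toward chance (0.5 with two
    comparison stimuli). Parameter values are illustrative."""
    return chance + (1 - chance) * math.exp(-decay * delay_s)

for d in (0, 2, 4, 8, 16):
    print(f"{d:2d}-s delay -> P(correct) = {retention_function(d):.2f}")
```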
Zentall (2001) further explained that the shape of the retention function can be useful in deciding the question of whether the subject’s representation is prospective or retrospective: Is the pigeon’s behavior during the delay controlled by the sample (retrospective representation) or by the correct member of the comparison pair (prospective representation)? One way to answer the question is to manipulate the type of stimulus that is used—different hues as opposed to different line orientations—for the sample and comparisons in order to determine whether the retention functions differ in corresponding ways. When Zentall and his colleagues arranged such tests, they found conclusive evidence that the representation was retrospective. Varying the sample stimuli affected the retention function, but varying the comparison stimuli did not. Based on such findings, Zentall concluded,
The advantage of cognitive accounts is that they encourage us to examine behavioral phenomena in a new light and by so doing to conduct novel experiments that often lead to a better understanding of the behavior of organisms, whatever results are obtained. (p. 75)
Behavioral Pharmacology: Drugs As Discriminative Stimuli But Not Necessarily Reinforcers
Branch (2006) made a similar claim in a recent review of the contribution of behavioral pharmacology to the experimental analysis of behavior; behavioral pharmacology is the study of the effects of drugs on behavior. Branch suggested that drugs may serve as discriminative stimuli in the three-term contingency and, in that role, help elucidate the stimulus control of private or covert events. To make the point, Branch first recognized that people typically learn to describe private or covert events from the same people who teach them to describe public events—events mutually available to the individual and to others. For example, a parent who accompanies her or his child to the health clinic for the child’s vaccination may express concern at the child’s signs of distress after the injection is administered—“I can tell that the needle hurt you.” By generalizing from the private events that accompany the parent’s verbal behavior on such occasions, the child may discriminate future instances where the expression “That hurts!” is reinforced as a report of covert behavior. Reinforcement is especially likely to occur when the child’s self-report is reliably accompanied by other overt indicators of pain.
Because animals do not report private events in the same verbal terms as humans do, experimenters must find alternative means for such reports to occur. Branch (2006) argued that behavioral pharmacologists do so when they administer a drug or a nondrug (such as saline solution) to subjects. Analysis of a subject’s overt performances in response to these two types of discriminative stimuli may show differences. All other things being equal, the overt differences can be taken as evidence of differences in the private events that accompany the administration of drug and nondrug, just as a child’s wincing and crying following a needle prick corroborate the self-report, “It hurts!” In fact, Branch proposed the use of procedures in which the presentation of public stimuli that accompany the administration of drug or nondrug is varied systematically. Doing so would allow the experimenter to estimate the relative robustness of private events versus public events.
In addition to serving as discriminative stimuli, Branch pointed out that drugs also reinforce behavior and cited a study by Donny et al. (2003), who trained rats to press a lever to deliver intravenous injections of nicotine. Using a standard procedure, they first trained rats to press one of two levers for food. Eventually lever presses produced food and nicotine. Presses on the second lever (the control lever) had no prearranged outcome. Food delivery was discontinued later. Rats continued to press the lever that delivered nicotine, albeit at a much lower rate than when food was available but still at a higher rate than on the control lever. When saline subsequently was substituted for nicotine, the rate of lever pressing decreased to become essentially the same as that on the control lever. At this point the results would seem to support the conclusion that nicotine reinforced lever pressing.
However, Donny et al. (2003) were not finished. As an important further control on whether nicotine is a reinforcer, they utilized a second group of rats, each of which was yoked to a subject in the first group—when a rat in the first group received an injection of nicotine, the yoked rat in the second group received an injection of nicotine simultaneously. That injection was not contingent on lever pressing, as it was for the rat in the first group. Nevertheless, the rate of lever pressing was nearly identical by rats in both groups. More importantly, lever pressing declined similarly in both groups when saline was substituted for nicotine and returned to near-original levels when nicotine was reintroduced. These surprising results call into question whether nicotine (and, by extension, other drugs) serves as a reinforcer or rather affects behavior in a different role. For example, nicotine may enhance some other aspect of the situation that is the real reinforcer therein. In any event, Branch (2006) concluded that research in behavioral pharmacology may illuminate issues for the study of behavior in general and not just the effects of drugs on behavior.
Behavioral Neurobiology: A Case Study in Attention-Deficit/Hyperactivity Disorder (ADHD)
The study of drug effects on behavior obviously opens the door to the study of those effects as correlated with co-occurring physiological events in the same subject. This possibility raises the question of the extent to which physiology should play a role in the experimental analysis of behavior. Recently, Timberlake, Schaal, and Steinmetz (2005) summarized Skinner’s view of the issue as follows: Neurobiology will eventually provide a unified account of behavior. That account will owe a considerable debt to the experimental analysis of behavior, which will be an invaluable guide to physiologists as they decipher nervous system events and their interrelations: Behavior analysis is propaedeutic to neurobiology, providing grist for the mill, as it were. Or to put it more tendentiously, neurobiology will proceed best by seeking the physiological underpinnings of the phenomena already identified by behavior analysts.
Skinner was not ready to turn behavior analysis over to neurophysiology, as he considered the latter insufficiently mature to tackle the complexities the former was revealing. Timberlake et al. (2005) expressed their impatience with Skinner’s insistence on waiting for a readily integrative neurobiology and argued instead that, where bridges between specific behavioral and neurobiological phenomena suggest themselves, they should be pursued. As a case in point, the expansive special issue of Journal of the Experimental Analysis of Behavior they edited was meant as a collection of prototypical bridges by which to link the two disciplines.
A theoretical article by Sagvolden, Johansen, Aase, and Russell (2005) provides a further illustration of the prospect. Sagvolden et al. first categorized the clinical symptoms of attention-deficit/hyperactivity disorder (ADHD), all of which are behavioral, with emphasis on the main symptoms of inattentiveness, overactivity, and impulsiveness. They then proceeded to articulate a developmental theory of ADHD that focuses (perhaps ironically) on hypoactivity: specifically, lowered activity in three different branches of a brain system featuring the neuromodulator dopamine. Deficits in dopaminergic function alter the function of specific neurotransmitters in brain circuits the authors designate the prefrontal loop, the limbic loop, and the motor loop.
For example, in the limbic loop, brief increases in dopamine activity have been recorded following the presentation of primary and conditioned reinforcers. Eventually such increases are evoked by the stimuli that signal the response-reinforcer contingency (i.e., by discriminative stimuli). Conversely, brief decreases in dopamine activity have been observed when reinforcers are no longer presented following responses during extinction. In addition to explaining how such effects may be mediated at the cellular level, Sagvolden et al. (2005) also described the predictable behavioral effects of the hypofunctioning dopaminergic system. A weakened limbic response to reinforcers may be linked to the symptom of delay aversion or impulsiveness. In other words, the weakened limbic response results in steeper delay-of-reinforcement gradients for individuals diagnosed with ADHD. In such cases, Sagvolden et al. recommended that more robust reinforcers be used as part of therapy for such individuals. They also discussed the implications of their theory for the role that parents, social agencies, and medication play in remitting the symptoms of ADHD.
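To make the notion of a “steeper gradient” concrete, the sketch below (our illustration, assuming an exponential gradient with time constant tau) compares how much credit a reinforcer delayed by 5 seconds retains under a shallow versus a steep gradient:

```python
import math

def delay_gradient(delay_s, tau):
    """Exponential delay-of-reinforcement gradient; smaller tau = steeper,
    so delayed reinforcers lose their effect more quickly."""
    return math.exp(-delay_s / tau)

# A 5-s delay costs little under a shallow gradient but nearly everything
# under a steep one (the pattern attributed to ADHD):
for tau in (10.0, 2.0):
    print(f"tau = {tau:4.1f}: relative effect of a 5-s-delayed "
          f"reinforcer = {delay_gradient(5, tau):.2f}")
```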
In an extended commentary on the Sagvolden et al. (2005) article, Catania (2005a) further developed the implications of altered delay-of-reinforcement gradients for the symptoms of ADHD. He argued that steeper gradients reflect the differential selection of rapid response rates (hyperactivity) and the weakened ability of conditioned reinforcers to maintain observing responses (attention deficit), as well as greater impulsiveness. An additional phenomenon may be at work, namely, extinction deficit, which he defined as the failure to observe stimuli that accompany extinction, thereby slowing its achievement. Catania recommended interventions that begin with deliberately short delays of reinforcement and the use of conditioned reinforcers in order to progressively extend the sequence of desirable behaviors that will persist when delays grow longer. He particularly advocated the use of computer games for this purpose.
Donahoe and Palmer (1994; see Donahoe, 2002, for an updated summary) described an alternative approach to the analysis of neurodevelopmentally mediated behavior, such as ADHD. Theirs is a Darwinian approach that draws heavily from the discipline of artificial intelligence and specifically the concept of neural networks. According to this view, the nervous systems of animals, including humans, may be characterized in terms of operational units that selectively process incoming signals and are, in turn, affected by the relative success of the behavior they effect. Donahoe and Burgos (2005) provided a commentary on the Sagvolden et al. (2005) article and on Catania’s (2005a) commentary in which they repeated the selectionist theme and applied a selection network simulation in analyzing the differential effects produced by varying the length of reinforcer delay. They showed that, with increasing delays, the decrease in the primary reinforcer’s effectiveness could be mitigated by the presentation of conditioned reinforcers during the delay interval—a process reminiscent of that recommended by Catania.
Signal Detection and Reinforcement History
Consideration of the symptoms of ADHD in the context of behavior analysis raises questions about the efficacy of stimuli—discriminative or reinforcing—in the control of behavior. With reference to Skinner’s previously cited aversion to cognitive theories, Wixted and Gaitan (2002) offered a middle ground. In their view, cognitive theories have generated research that is to be respected for revealing otherwise unsuspected human abilities. At the same time, cognitive theories may obscure the reinforcement contingencies of which behavior is a function by taking the place of individual reinforcement histories (i.e., by functioning as surrogates for those histories). The authors offer cognitive models, specifically signal-detection models of human recognition memory, as a case in point.
Signal-detection models include parameters for individual sensitivity and individual bias and thus have something in common with the generalized matching law referred to earlier. The models have wide applicability to decision tasks requiring the recognition of a specific event against a background of noise—for example, recognizing a specific face amid a sea of faces. In a typical recognition memory task, subjects first study a list of words and later decide which words on a test list were on the first list. Half of the items on the test list were included in the first list (targets); the other half were not (lures). Four outcomes are possible: The subject correctly recognizes a target (a “hit” in the parlance of signal detection theory); the subject fails to recognize a target (a “miss”); the subject identifies a lure as a target (a “false alarm”); or the subject correctly identifies a lure (a “correct rejection”).
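From the hit and false alarm rates one can compute the standard equal-variance signal-detection indices, sensitivity d' and criterion (bias) c. A minimal sketch, with made-up rates for illustration:

```python
from statistics import NormalDist

def sdt_indices(hit_rate, false_alarm_rate):
    """Equal-variance signal detection: d' = z(H) - z(F) measures
    sensitivity; c = -(z(H) + z(F)) / 2 measures response bias."""
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(false_alarm_rate)
    criterion = -(z(hit_rate) + z(false_alarm_rate)) / 2
    return d_prime, criterion

# A subject with 80% hits and 20% false alarms is fairly sensitive and
# unbiased; lowering false alarms alone shifts the criterion conservatively.
for h, f in ((0.80, 0.20), (0.80, 0.05)):
    d, c = sdt_indices(h, f)
    print(f"H = {h:.2f}, F = {f:.2f} -> d' = {d:.2f}, c = {c:.2f}")
```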
Wixted and Gaitan (2002) considered a pair of models for recognition memory. According to one model (familiarity), decisions are based on the subject’s relative familiarity with the test items. The other model (likelihood ratio) stipulates that decisions occur on the basis of the odds ratio—the likelihood that an item was drawn from the targets as opposed to the lures. Consequently, it is the more complex of the two models. In further discussion, the authors showed that, as hit rate increases, the familiarity model predicts that the false alarm rate will remain constant. By contrast, the likelihood ratio model predicts that the false alarm rate will decrease and vice versa (the mirror effect). The latter prediction is strongly supported by the recognition memory literature, even though it requires considerably more sophistication on the part of the subject than the familiarity model does. Wixted and Gaitan reviewed additional findings in which the likelihood ratio made unique predictions that experimental results confirmed.
Then their analysis took a provocative turn. They asked whether the complex abilities of humans displayed in recognition memory tasks are biological endowments (i.e., part of a cognitive apparatus that defines humans by nature) or instead might be considered to result from a substantial, individual history of reinforcement for decision making. In an attempt to answer the question, they reviewed data from pigeon experiments involving recognition memory tasks akin to those used with human subjects. The procedure involved two kinds of trials—sample and no-sample. In the former, a sample stimulus appears, followed by a retention interval, then a choice between two test stimuli. A response to one of them produces the reinforcer. In no-sample trials, the trial begins with the retention interval and ends with a response to one of the two test stimuli. Here it is the other stimulus that is correct. In this way, the pigeon discriminates between sample and no-sample trials. Pigeons learn to perform the task with high accuracy and also produce the mirror effect, consistent with the likelihood ratio model. Wixted and Gaitan (2002) reviewed the results from experiments in which there was systematic manipulation of the retention interval across sessions or within sessions. In both cases, they observed a similar pattern of initial disruption of performance followed by a return to criterial performance predicted by the likelihood ratio model—a pattern they directly attributed to the subject’s history of reinforcement during the experiment. To drive the point home, the authors demonstrated the mathematical equivalence of the generalized matching law referred to in an earlier section and the likelihood ratio model.
Wixted and Gaitan’s (2002) conclusion is imposing: It is unnecessary to account for the results of the pigeon research in terms of cognitive mechanisms; reinforcement history suffices. By extension, similar accounts conceivably could be made for human recognition memory, were sufficient details of individual reinforcement history available. In the authors’ words,
Skinner (1977) was probably right when he asserted that “the mental apparatus studied by cognitive psychology is simply a rather crude version of contingencies of reinforcement and their effects” (p. 9). If so, much of what is interesting about human memory will be illuminated by studying animals whose reinforcement history can be studied in a much more direct way. (p. 305)
Summary
The tension between the experimental analysis of behavior and cognitive psychology may be largely rhetorical (it is a matter of differences in how one talks scientifically about the same data) but still runs deep. Tensions are also discernible within behavior analysis over the status of hypothetical, noncognitive mechanisms of behavior (see, for example, the Book Review and Commentaries section in the July, 2004, issue of Journal of the Experimental Analysis of Behavior). Reese (2001) is one of many commentators (see also Critchfield, 2006; Lattal & Chase, 2003; Roediger, 2004) who have argued for a measured approach to these and wider tensions.
The three-term contingency of reinforcement is the touchstone of operant conditioning. DeGrandpre (2000) demonstrated its relevance for postmodern perspectives, including deconstructionism and social constructionism, using drug abuse and its treatment as a case in point. Mattaini (2006) extended the purview of operant conditioning to larger-scale projects for lifting human suffering, including “domestic violence, terrorism, the environmental impact of consumer behavior, substance use, rehabilitation of prisoners, the science of nonviolence, human rights, child labor, racial discrimination, gambling, and HIV/AIDS, to name only a few” (p. 10).
A recent article by Killeen (2004) provides an integrative adieu for our chapter. He introduced stimulus discrimination theory (SDT), with its joint emphasis on sensitivity and bias, and its avatar, signal detection theory. Sensitivity refers to stimulus control of the response and bias to control of the response by the reinforcer; thus, SDT aligns with the three-term contingency of reinforcement. Killeen applied SDT to Skinner’s (1957) analysis of verbal behavior, specifically to the categories known as tacts (verbal behavior largely under the control of discriminative stimuli) and mands (verbal behavior largely under the control of reinforcers), and then took up the issue of causal attribution (i.e., judgments about which actions within a context are responsible for a specific consequence). In doing so, he showed that pigeons and humans assign causation similarly and that inferences about the behavioral effects of covert stimuli are premised on the observation of behavior when stimuli are overt. In other words, suppose behavior is observed to change when a new stimulus is added to a previously presented stimulus. A similar behavioral change occurs when drug administration coincides with the previously presented stimulus. Such behavioral correspondence—the same effects whether a new stimulus is presented or a drug is presented—becomes the warrant for inferring that the subjective effect of the drug is akin to adding the other stimulus.
In a tour-de-force, Killeen (2004) also showed that the three-term contingency of reinforcement can be fitted to Aristotle’s fourfold of causes—efficient (the discriminative stimulus), material (the response), final (the reinforcer), and formal—the combination of discriminative stimulus, response, and reinforcer in the three-term contingency itself. But Killeen’s ultimate interest was to provide a framework for discriminating the causes of behavior on the gamut running from control by reinforcers to control by covert stimuli. What he offered, in the end, was a view of multiple prospective causes linked in a chain. To account scientifically for behavior, the task becomes discriminating the actual causes from the prospective, with reinforcers contingent on doing so. It is a task behavior analysts, of all creatures, can relish.
References:
- Baum, W. M. (1974a). Choice in free-ranging wild pigeons. Science, 185, 78-79.
- Baum, W. M. (1974b). On two types of deviation from the matching law: Bias and undermatching. Journal of the Experimental Analysis of Behavior, 22, 231-242.
- Baum, W. M., & Kraft, J. R. (1998). Group choice, competition, travel, and the ideal free distribution. Journal of the Experimental Analysis of Behavior, 69, 227-245.
- Branch, M. N. (2006). How research in behavioral pharmacology informs behavioral science. Journal of the Experimental Analysis of Behavior, 85, 407-423.
- Breland, K., & Breland, M. (1961). The misbehavior of organisms. American Psychologist, 16, 681-684.
- Catania, A. C. (2005a). Attention-deficit/hyperactivity disorder (ADHD): Delay-of-reinforcement gradients and other behavioral mechanisms. Behavioral and Brain Sciences, 28, 419-424.
- Catania, A. C. (2005b). The operant reserve: A computer simulation in (accelerated) real time. Behavioural Processes, 69, 257-278.
- Critchfield, T. (2006). On the future of behavior analysis: Introduction to the special issue. Association for Behavior Analysis International Newsletter, 29(3), 1-3.
- Davison, M., & Baum, W. M. (2006). Do conditioned reinforcers count? Journal of the Experimental Analysis of Behavior, 86, 269-283.
- Davison, M., & McCarthy, D. (1987). The matching law: A research review. Hillsdale, NJ: Erlbaum.
- DeGrandpre, R. J. (2000). A science of meaning: Can behaviorism bring meaning to psychological science? American Psychologist, 55, 721-739.
- Donahoe, J. W. (2002). Behavior analysis and neuroscience. Behavioural Processes, 57, 241-259.
- Donahoe, J. W., & Burgos, J. E. (2005). Selectionism: Complex outcomes from simple processes. Behavioral and Brain Sciences, 28, 429-430.
- Donahoe, J. W., & Palmer, D. C. (1994). Learning and complex behavior. Boston: Allyn & Bacon.
- Donny, E. C., Chaudhri, N., Caggiula, A. R., Evans-Martin, F. F., Booth, S., Gharib, M. A., et al. (2003). Operant responding for a visual reinforcer in rats is enhanced by noncontingent nicotine: Implications for nicotine self-administration and reinforcement. Psychopharmacology, 169, 68-76.
- Dragoi, V., Staddon, J. E. R., Palmer, R. G., & Buhusi, C. V. (2003). Interval timing as an emergent learning property. Psychological Review, 110, 126-144.
- Ferster, C. B., & Skinner, B. F. (1957). Schedules of reinforcement. Englewood Cliffs, NJ: Prentice Hall.
- Green, L., & Myerson, J. (2004). A discounting framework for choice with delayed and probabilistic rewards. Psychological Bulletin, 130, 769-792.
- Killeen, P. R. (1988). The reflex reserve. Journal of the Experimental Analysis of Behavior, 50, 319-333.
- Killeen, P. R. (1994). Mathematical principles of reinforcement. Behavioral and Brain Sciences, 17, 105-172.
- Killeen, P. R. (2004). Minding behavior. Behavior and Philosophy, 32, 125-147.
- Killeen, P. R., & Fetterman, J. G. (1988). A behavioral theory of timing. Psychological Review, 95, 274-295.
- Killeen, P. R., & Sitomer, M. T. (2003). MPR. Behavioural Processes, 62, 49-64.
- Kraft, J. R., & Baum, W. M. (2001). Group choice: The ideal free distribution of human social behavior. Journal of the Experimental Analysis of Behavior, 76, 21-42.
- Lattal, K. A., & Chase, P. N. (Eds.). (2003). Behavior theory and philosophy. New York: Kluver Academic/Plenum.
- Lejeune, H., Richelle, M., & Wearden, J. H. (2006). About Skinner and time: Behavior-analytic contributions to research on animal timing. Journal of the Experimental Analysis of Behavior, 85, 125-142.
- Machado, A. (1997). Learning the temporal dynamics of behavior. Psychological Review, 104, 241-265.
- Mattaini, M. (2006). Trends in social issues. Association for Behavior Analysis International Newsletter, 29(3), 10-11.
- Mazur, J. E. (2006). Mathematical models and the experimental analysis of behavior. Journal of the Experimental Analysis of Behavior, 85, 275-291.
- Nevin, J. A. (1992). An integrative model for the study of behavioral momentum. Journal of the Experimental Analysis of Behavior, 57, 301-316.
- Nevin, J. A., & Grace, R. C. (2000). Behavioral momentum and the law of effect. Behavioral and Brain Sciences, 23, 73-130.
- Rachlin, H. (2000). The science of self-control. Cambridge, MA: Harvard University Press.
- Rachlin, H. (2002). Altruism and selfishness. Behavioral and Brain Sciences, 25, 239-296.
- Reese, H. W. (2001). Review of The war between mentalism and behaviorism: On the accessibility of mental processes by William R. Uttal. Journal of the Experimental Analysis of Behavior, 76, 115-130.
- Roediger, R. (2004). What happened to behaviorism? The APS Observer, 17(3), 5, 40-42.
- Sagvolden, T., Johansen, E. B., Aase, H., & Russell, V. A. (2005). A dynamic developmental theory of attention-deficit/hyper-activity disorder (ADHD) predominantly hyperactive/impulsive and combined subtypes. Behavioral and Brain Sciences, 28, 397-468.
- Shahan, T. A., & Podlesnik, C. A. (2005). Rate of conditioned reinforcement affects observing rate but not resistance to change. Journal of the Experimental Analysis of Behavior, 84, 1-17.
- Skinner, B. F. (1938). The behavior of organisms: An experimental analysis. New York: Appleton-Century-Crofts.
- Skinner, B. F. (1948). Superstition in the pigeon. Journal of Experimental Psychology, 38, 168-172.
- Skinner, B. F. (1953). Science and human behavior. New York: Macmillan.
- Skinner, B. F. (1957). Verbal behavior. New York: Appleton-Century-Crofts.
- Skinner, B. F. (1977). Why I am not a cognitive psychologist. Behaviorism, 5, 1-10.
- Skinner, B. F. (1990). Can psychology be a science of mind? American Psychologist, 45, 1206-1210.
- Staddon, J. E. R., & Simmelhag, V. G. (1971). The “superstition” experiment: A reexamination of its implications for the principles of adaptive behavior. Psychological Review, 78, 3-43.
- Timberlake, W. (2004). Is the operant contingency enough for a science of purposive behavior? Behavior and Philosophy, 32, 197-229.
- Timberlake, W., Schaal, D., & Steinmetz, J. E. (2005). Relating behavior and neuroscience: Introduction and synopsis. Journal of the Experimental Analysis of Behavior, 84, 305-311.
- Williams, B. A. (2001). The critical dimensions of the response-reinforcer contingency. Behavioural Processes, 54, 111-126.
- Wixted, J. T., & Gaitan, S. C. (2002). Cognitive theories as reinforcement history surrogates: The case of likelihood ratio models of human recognition memory. Animal Learning & Behavior, 30, 289-305.
- Zentall, T. R. (2001). The case for a cognitive approach to animal learning and behavior. Behavioural Processes, 54, 65-78.