Operant Conditioning Research Paper


If there is a single topic within psychology with which you are familiar, it is surely learning. After all, you have spent much of your life doing just that. As a child, you learned how to walk, how to talk, and how to get along (or not get along) with others. In school, you learned English, algebra, biology, geometry, and world history, among other things. And as an adult, you have continued to learn new things about yourself, your interests, and the world. With all that experience as a learner, you might think that you know everything there is to know about learning. But can you define what learning is? Can you describe how you have learned all the things that you have learned? Probably not—our experience as learners generally teaches us what we have learned but not how we have learned.

Understanding the nature of learning is complicated by the fact that learning occurs in so many ways. Nonetheless, psychologists generally define learning as a relatively permanent change in behavior and thinking based on experience. Learning cannot be observed directly; typically, psychologists infer that learning has occurred by noticing particular changes in behavior. However, behavior change is not by itself learning; it reflects only the possibility that learning has occurred. Thus, psychologists must be careful not to assume that changes in behavior always reflect learning. Consider, for example, how your behavior changes depending on whether you are wide awake or drowsy. Your arousal level may determine how well you perform on an examination, or the skill with which you perform an activity such as playing a musical instrument or catching a baseball. Changes in behavior caused by arousal level, fatigue, ingestion of drugs, or emotional problems do not necessarily reflect learning.

Likewise, learning may occur without noticeable changes in behavior. In some cases, learning is not apparent, at least not right away, from changes in our observable behavior. In other cases, you may never have the opportunity to demonstrate what you have learned. For example, although you most likely learned how to change a flat tire in your high school driver’s education course, your behavior will not reflect that learning unless you have the actual opportunity to change a flat tire. In still other cases, you may not be sufficiently motivated to demonstrate things that you have learned. Such might be the case when a teacher poses a question to the class: although you may know the answer, you do not say it because you just don’t feel like raising your hand or you get nervous when speaking publicly.

As you can see, learning and behavior are closely but not perfectly related. Indeed, it is unusual for psychologists to speak of one without also speaking of the other. Psychologists study the relation between learning and behavior by manipulating the environment in which behavior occurs, that is, experience. In this research paper, we will focus on a particular form of learning: operant learning, which is better known as operant conditioning. We will outline the basic historical background of operant conditioning as well as the key terms, ideas, and theory relevant to understanding this very important type of learning.

Brief Historical Overview

Although operant learning (or conditioning) has existed since the beginning of human history, the study of the basic principles of operant conditioning is only about 100 years old. In fact, much of what we know today about operant learning resulted from the remarkable research of E. L. Thorndike and B. F. Skinner, who both designed clever experiments investigating how the consequences of one’s actions influence the subsequent acquisition and maintenance of behavior.

E. L. Thorndike

Edward Lee Thorndike (1874-1949) enjoyed a prolific professional career, which included the study of children’s mind-reading abilities, instinctive behavior in chickens, and how best to teach reading and spelling to schoolchildren (Thorne & Henley, 2001). However, Thorndike is perhaps most noted for his work in which he identified key principles of learning that turned out to be very similar to what we now call operant conditioning. Thorndike studied the abilities of cats to escape from puzzle boxes that he constructed from wood. Each puzzle box required a specific response or set of responses to be emitted in order for the cat to escape the box. During each trial, Thorndike watched each cat inside the box and carefully recorded what he observed. Over repeated trials, the cats learned to escape the box with increasing speed.

For Thorndike, the decreased latency between entry to the puzzle box and subsequent escape over repeated trials was indicative of improved performance, or learning. From this research, Thorndike developed his famous Law of Effect, which states that responses that produce satisfaction will be more likely to recur and thus be strengthened (Thorndike, 1911). In his own words:

Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur. (p. 244)

Likewise, a response followed by discomfort will be unlikely to recur and thus be weakened. Although Thorndike’s work has been greatly advanced and improved in subsequent research, the law of effect remains an important one for understanding how operant learning occurs. As you read about Skinner’s work with operant conditioning below, you will surely see Thorndike’s influence on the field today.

B. F. Skinner

Like Thorndike, Burrhus Frederic Skinner (1904-1990) did his graduate work at Harvard University. Skinner championed the laboratory study of the law of effect and advanced its application to the study of the human condition and its attendant problems. He devised ingenious and rigorous laboratory methods for studying learning, invented unique laboratory apparatuses and methods for observing behavior, and created his own philosophy for interpreting it (Bolles, 1979).

Unlike many of his predecessors, Skinner believed that behavior should be examined for its own sake, and not only in terms of the underlying physiology or mental events that purportedly cause behavior. For Skinner, changes in behavior depend wholly on changes in environmental events. Skinner studied simple behavior under highly controlled laboratory conditions—for example, rats pressing a lever or pigeons pecking a plastic disk to access food. Skinner studied his subjects for long periods of time, as opposed to studying large numbers of animals for short periods of time. Skinner’s brand of research, the scientific study of behavior, is known as experimental analysis of behavior. The application of knowledge gained from the experimental analysis of behavior to solving socially relevant issues is called applied behavior analysis, which is among the most vibrant fields of applied psychology today.

Skinner’s (1938) The Behavior of Organisms presented one of his first articulations of what we now know as operant conditioning. By studying individual organisms in controlled laboratory apparatuses, called operant chambers, Skinner identified ways in which behavior may change as a result of manipulating environmental consequences. He also found that different schedules of the availability of consequences produced different effects on behavior. Finally, Skinner (1950) suggested that it is possible to study behavior and the processes of learning without a theory—a notion perhaps more radical than the notion of studying behavior for its own sake. The scope of operant conditioning has widened from studying the behaviors of Skinner’s rats and pigeons to an array of complex human behaviors, such as reducing problem behaviors in children, making the workplace safer, and understanding and combating drug-use behavior.

In addition to his contributions to behaviorism and operant conditioning, Skinner’s work led to the founding of two important psychological journals: the Journal of the Experimental Analysis of Behavior (1958) and the Journal of Applied Behavior Analysis (1968). Beyond the well-known operant chamber, Skinner invented several other novel apparatuses, including the cumulative recorder, the baby-tender, and the teaching machine. Finally, Skinner produced many notable writings, which remain influential today in the field of psychology. Among such writings are The Behavior of Organisms (1938), Schedules of Reinforcement (with C. B. Ferster, 1957), and Verbal Behavior (1957). One book, Walden Two, describes a utopian society based on the application of operant principles to everyday living (Skinner, 1948).

Operant Conditioning

What Is an Operant?

The term operant refers simply to a behavior that is changed as a result of its consequences (Millenson & Leslie, 1979). The term conditioning refers to a specific type of learning. Thus, operant conditioning is learning produced as a result of behavioral consequences. More specifically, operant conditioning is behavior change that results from the association between or among responses and their consequences. The more frequently an operant occurs, the stronger it becomes (Skinner, 1938).

The Three-Term Contingency

Behavior does not occur in a vacuum; certain environmental events usually precede and follow behavior. Consider, for example, the practice of answering the telephone. The phone rings, you pick it up and say “hello” into the mouthpiece, and a person on the other end of the line begins to speak. How many times have you picked up the phone when it was not ringing and said “hello”? That would be a foolish thing to do, simply because no one is on the other end of the line with whom to speak. Certainly, your family, friends, or roommates would think you quite strange for this sort of behavior. We answer the phone (the behavioral event) because it rings (the preceding event) and because in the past someone has been at the other end of the line with whom to talk (the consequence, or following event). Skinner referred formally to the relationship among these three events as the three-term contingency. More specifically, the three-term contingency involves the following:

  • The preceding event, or discriminative stimulus, which “sets” the occasion for responding because in the past that response has produced a certain consequence in the presence of that specific stimulus.
  • The behavioral event or the response.
  • The following event, or consequence, which is dependent on the response. If the response does not occur, a consequence is not produced.

Thus, operant behavior occurs in the presence of certain stimuli and is followed by certain consequences. In the presence of a discriminative stimulus, a consequence will occur if and only if a response first occurs. The three-term contingency permits psychologists to understand any behavior that may occur in one context or situation (i.e., when a discriminative stimulus is present) but not in another (i.e., when the discriminative stimulus is not present). There are two basic types of consequences that may influence an operant: reinforcement and punishment.
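As a rough illustration, the three-term contingency can be written as a small data structure. The sketch below is hypothetical: the class name `Contingency`, its field names, and the function `consequence_for` are illustrative choices rather than established terminology. It encodes only the rule that the consequence follows if and only if the response occurs in the presence of the discriminative stimulus.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Contingency:
    """Hypothetical record of one three-term contingency."""
    discriminative_stimulus: str  # the preceding event
    response: str                 # the behavioral event
    consequence: str              # the following event


def consequence_for(contingency: Contingency,
                    stimulus_present: bool,
                    response_emitted: bool) -> Optional[str]:
    """Return the consequence only if the stimulus is present and the response occurs."""
    if stimulus_present and response_emitted:
        return contingency.consequence
    return None


# The telephone example from the text.
phone = Contingency(
    discriminative_stimulus="phone rings",
    response="pick up and say hello",
    consequence="caller begins to speak",
)

print(consequence_for(phone, stimulus_present=True, response_emitted=True))
print(consequence_for(phone, stimulus_present=False, response_emitted=True))
```

The second call returns `None`, which mirrors saying "hello" into a phone that has not rung: the response occurs, but no consequence follows.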

Types Of Behavioral Consequences


Reinforcement

Reinforcement occurs when an event results in the increased probability that the preceding behavior will recur on a future occasion. Or, put another way, reinforcement occurs if an event following a response increases subsequent responding. If a behavior increases following the addition of a stimulus (event or consequence), then that stimulus is called a reinforcer. A reinforcer is available only contingent upon a response: reinforcement occurs if and only if the organism emits a specific behavior. Although this principle may seem difficult to understand at first, the idea is one with which you are quite familiar. For example, if you are cold (a discriminative stimulus) and you put on a coat (the behavior), then you begin to feel warm (a reinforcing consequence). On future occasions when you feel cold, you will be more likely to put on a coat to help you feel warm because in the past you learned the association between the behavior of putting on your coat and warming up. In this case, the behavior of putting on a coat is reinforced, because warming yourself up is contingent upon putting on your coat. When a stimulus is added, resulting in an increase in behavior, we call that positive reinforcement. In operant conditioning, the term positive always refers to the addition of a consequential stimulus and does not imply a good or desirable consequence, although, to be sure, some added consequences are desirable.

Let’s consider a second example of positive reinforcement. Suppose you tell a child to clean her room (the discriminative stimulus), and after she has cleaned the room (behavior), you give her a lollipop (a potential reinforcer). As a result, she is more likely to clean up her room in the future when asked because in the past she has received a lollipop for doing so. In this case, the lollipop served as a reinforcer for room-cleaning behavior.

In addition to adding a consequential stimulus to increase behavior, we can also subtract (or remove) a consequential stimulus to increase behavior. For example, suppose that you are wearing a coat and you become too warm. Of course, removing the coat will cool you off, which in this case is a pleasant condition for you. In this example, taking away the coat results in a consequence that increases the probability that you will remove your coat on future occasions when you feel hot while wearing it. The removal of a stimulus that results in an increased behavior is called negative reinforcement. The term negative in operant conditioning always refers to the subtraction or reduction of a consequential stimulus and does not imply a bad or undesirable consequence. Consider another example of negative reinforcement. Suppose you have a headache. You have certainly learned by now that when you have a headache, you can take an over-the-counter pain medication such as aspirin to quell or at least lessen the headache. Taking an aspirin (behavior) in the presence of a headache (the discriminative stimulus) results in its removal or reduction (the consequence), making it very likely that when you get a headache in the future, you will take aspirin (or some other pain reliever) again. The likelihood of taking aspirin again is increased as a result of subtracting the headache.

Keep in mind that it is all too easy to confuse positive and negative reinforcement. Here is a rule that will help you keep the two straight: Both positive and negative reinforcement increase the likelihood that a given behavior will recur. However, positive reinforcement involves the addition or presentation of a consequential stimulus, and negative reinforcement involves the subtraction, removal, or reduction (and sometimes the prevention) of a consequential stimulus.


Punishment

Punishment occurs when an event results in the decreased probability that the preceding behavior will occur on a future occasion: If an event following a response decreases subsequent responding, punishment has occurred. The stimulus that produces the decrease in behavior is called a punisher, and the decreased behavior is said to be punished. Punishment is also a result of a contingency: punishment will occur if and only if a specific behavior is emitted. Like reinforcement, punishment may also be either positive or negative. For example, suppose that you use more minutes than your cell phone plan allows for the month. As a result of your using extra minutes, the cellular company charges your account a fee for each additional minute. Additional charges decrease the likelihood that you will go over your minute limit in the future. In this case, using too many minutes is punished. The extra charges are contingent upon use of too many minutes and occur if and only if you use too many minutes. Notice that in this example, a consequential stimulus (the fee) is added to decrease behavior. A decrease in behavior as a result of adding a stimulus is called positive punishment.

Let’s consider another example. Suppose a mother tells her child not to touch his grandmother’s valuable collection of antique vases just prior to a visit to the grandmother’s house. Despite the warning, the child picks up a vase and accidentally drops it, and the vase smashes into a thousand pieces all over the floor. Immediately, the mother scolds him. Chances are that the next time he visits his grandmother, he will refrain from touching the vase collection—his behavior of touching the vases will decrease as a result of the scolding (the addition of the scolding lessens the probability of vase touching in the future).

In addition to adding a consequential stimulus in order to decrease behavior, a consequential stimulus can also be subtracted to decrease behavior. For example, suppose that your cellular phone drops calls each time you walk into your kitchen. Dropped calls when you enter the kitchen decrease the likelihood that you will enter the kitchen while talking on your cell phone in the future. Notice that in this example, a consequential stimulus (the connection with the person with whom you were talking) is subtracted, and its loss produces a decrease in behavior (talking on your cell phone in the kitchen). A decrease in behavior as a result of subtracting a consequential stimulus is called negative punishment.

Consider another example. You are a high school student, and your parents have purchased tickets for you and a friend to see your favorite band in concert. Your father asks you to do your homework before bed one evening, and you refuse. As a result, your father takes away your tickets to the concert. On future occasions, you are more likely to obey your father’s requests. In this case, your disobeying behavior is decreased by subtracting a consequential stimulus.

Note that psychologists always define reinforcement and punishment functionally—in terms of their effects on behavior. Consider a student whose teacher gives her a bonus point on an assignment for turning it in on time. If, on future occasions, she turns assignments in late, the bonus point did not reinforce or increase punctual behavior. Thus, even though the student received the bonus point, it had no effect on punctuality—the bonus point did not serve as a reinforcer. This example may seem counterintuitive, but it is illustrative of the importance of understanding how reinforcement and punishment are functionally defined. We can never know ahead of time if a stimulus or event is a reinforcer or punisher (or neither). We must always wait until after we have seen its effects on behavior before we can label it.
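The four kinds of consequences, together with the functional definition just described, can be summarized as a small lookup. The sketch below is only an illustration: the function name `classify_consequence` and its string labels are assumptions made for this example. The label depends on two observations, namely whether a stimulus was added or removed and how the behavior subsequently changed.

```python
def classify_consequence(stimulus_change: str, behavior_change: str) -> str:
    """Label a consequence from its observed effect on behavior.

    stimulus_change: "added" or "removed".
    behavior_change: "increased", "decreased", or "unchanged",
                     as observed on later occasions.
    """
    if stimulus_change not in ("added", "removed"):
        raise ValueError("stimulus_change must be 'added' or 'removed'")
    if behavior_change not in ("increased", "decreased", "unchanged"):
        raise ValueError("behavior_change must be 'increased', 'decreased', or 'unchanged'")

    # Functional definition: with no effect on behavior, no label applies.
    if behavior_change == "unchanged":
        return "neither reinforcement nor punishment"

    sign = "positive" if stimulus_change == "added" else "negative"
    kind = "reinforcement" if behavior_change == "increased" else "punishment"
    return f"{sign} {kind}"


# Examples drawn from the text.
print(classify_consequence("added", "increased"))    # lollipop for cleaning the room
print(classify_consequence("removed", "increased"))  # aspirin removes a headache
print(classify_consequence("added", "decreased"))    # fee for extra phone minutes
print(classify_consequence("removed", "decreased"))  # concert tickets taken away
print(classify_consequence("added", "unchanged"))    # bonus point with no effect
```

Note that the function cannot be called until the change in behavior has been observed, which is the point of the functional definition.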

A second caveat is also in order: Psychologists restrict the use of the terms reinforcement and punishment to describe only environmental operations and their effects on the behavior of organisms. Behavior, not the organism, is said to be reinforced or punished.

The relationship between behavior and its consequences is often far more complex than can be described in this research paper. Indeed, some psychologists spend their entire careers trying to understand this complex relationship. In what follows, we will describe the most basic elements of operant conditioning by focusing primarily on positive reinforcement.

Basic Principles Of Operant Conditioning

Response Acquisition Through Shaping

How does a discriminative stimulus come to control behavior? How does an organism learn the relationship among discriminative stimuli, its behavior, and consequences? In short, how is behavior acquired? Most behavior is acquired gradually through the organism’s interaction with reinforcing and punishing events in its environment. Skinner developed a technique, called shaping, for teaching new behavior to his nonhuman animal subjects. Shaping involves reinforcing any behavior that successively approximates the desired response. Shaping maximizes the likelihood that the desired, or target, behavior will occur. As an organism’s behavior more closely resembles the desired behavior, reinforcement is made available; however, if the organism’s behavior differs from the target behavior, then reinforcement is not available.

Consider an example. Suppose you have a dog that you wish to “teach” to sit up on its hind legs. Although your dog does not spontaneously sit up on his hind legs, he will emit other behaviors such as sitting down on his hind legs, jumping up, or drawing his paws into the air in front of him. None of these behaviors is the target behavior, but each one is in some way similar to, or an approximation of, the target behavior. Thus the process of shaping behavior is one that takes advantage of natural variability in behavior. Without variability, shaping could not occur, for behavior would never differ in any way to approximate a new response.

Let’s get back to our example. You might present your dog with a tasty treat if he looks toward you when you say the words “sit up.” Once that response is mastered, then you may present a treat only if the dog looks toward you and then sits. Once he is looking toward you and is sitting, you may reinforce a new, closer approximation of the target behavior—sitting up on his hind legs. If your dog fails to sit (but remains standing) on the next occasion on which you ask him to sit up, you would not present a treat because his behavior was not any more similar to the target behavior. However, if the dog looks at you while sitting and then makes any motion that resembles lifting its front legs off the floor, you present the treat. Gradually, each time you command “sit up,” your dog will come closer and closer to the target behavior of sitting on his hind legs with both paws in front of his chest. The use of a reinforcer (e.g., a treat) shapes the dog’s behavior of sitting up.

Let’s walk through another example, this time featuring human behavior. Suppose you want your friend to be punctual each time she is to meet you. Being blunt, and getting upset that she is late all the time, is not always the most effective way to change behavior while also trying to maintain a relationship. In this case, shaping represents an effective alternative method for changing her behavior. Suppose that your friend is typically about 15 minutes late to meet you and on one occasion she is only 10 minutes late. On this occasion, offer a potential reinforcer—perhaps simply by telling her how much you appreciate that she was almost on time. The next time you meet your friend, offer reinforcement if she is less than 10 minutes late, and so on until your friend is on time. Who knows—maybe she will even be early someday!
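The punctuality example can be sketched as a simple loop in which the criterion for reinforcement tightens after each improvement. This is a minimal sketch under stated assumptions: the function name `shape_punctuality`, the list of lateness values, and the rule that the criterion is set to the most recent improved value are all hypothetical choices for illustration, not a procedure taken from the literature.

```python
from typing import List


def shape_punctuality(minutes_late: List[int], starting_criterion: int) -> List[bool]:
    """Return, for each meeting, whether a potential reinforcer is offered.

    A reinforcer is offered when the friend is less late than the current
    criterion, or when she is on time. Each improvement becomes the new,
    stricter criterion (a closer approximation of the target behavior).
    """
    criterion = starting_criterion
    offered = []
    for late in minutes_late:
        if late < criterion or late <= 0:
            offered.append(True)
            criterion = min(criterion, late)  # tighten the criterion
        else:
            offered.append(False)             # no closer to the target
    return offered


# Hypothetical sequence of meetings: minutes late on each occasion.
meetings = [15, 10, 12, 8, 8, 3, 0, 0]
print(shape_punctuality(meetings, starting_criterion=15))
# [False, True, False, True, False, True, True, True]
```

Only arrivals that come closer to the target than any previous arrival are followed by the potential reinforcer; whether that stimulus actually functions as a reinforcer can be judged only from its effect on later behavior.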

As you may have guessed, the trick to shaping is to identify an effective reinforcer. Recall that not all potential reinforcers turn out to be reinforcing—only those that increase behavior will be true reinforcers. If you buy your friend a soda when she is on time, and she does not continue to be on time on future occasions, the soda did not serve as a reinforcer. She may prefer verbal praise or a smile or a hug. Identifying which potential reinforcers may actually shape behavior is generally a trial-and-error process—but the better you know a person, the more you will know about the things she likes or doesn’t like.

Shaping is a useful procedure both in the laboratory and in the real world. Skinner shaped rats to press levers and pigeons to peck colored disks to gain access to food in an operant chamber. Parents use shaping to teach their children to lace their shoes, get dressed, use language, catch a ball, ride a bicycle, and many other behaviors.

One advantage of the shaping process is that behavior can change in response to the environment without the need to use verbal instructions telling the organism what to do. In effect, reinforcement “tells” the organism how to behave. Generally speaking, reinforcement and its absence are used in shaping—punishment is not.

Response Discrimination

In many instances, only one aspect of behavior produces reinforcement in a particular circumstance, and all others do not. This process of an organism learning which aspects of a response produce reinforcement is called response discrimination. It develops as the organism learns that only certain features of a response, such as its speed, accuracy, or frequency, are reinforced in any particular context. For example, when teaching your dog to come to you when you utter the command “come,” you might give it a treat if and only if it runs toward you. You do not provide it a treat for walking or crawling toward you.

Response Generalization

In some cases, different aspects of the same behavior may produce reinforcement in the presence of the same context. This process is called response generalization. In contrast to response discrimination, response generalization occurs when more than one feature of behavior will produce the same reinforcer. For example, suppose that you wish to teach your dog to come toward you when you say the word “come.” Each time it does, you give it a treat—but it doesn’t matter how the dog approaches you. It could walk or run or even dance its way toward you—it still approaches you when you command it to come.


Extinction

Any procedure in which a reinforcer is no longer presented following a response is called extinction. Behavior that is no longer reinforced decreases in frequency until it stops occurring altogether. Such behavior is said to extinguish because it is no longer emitted. Suppose you put money into a vending machine to purchase a soft drink and nothing comes out. Will you keep putting money into the machine? Not likely. One interesting consequence of extinction is that it initially produces response variability. When the soft drink machine is working properly, your behavior is simple. You drop several coins into the machine, push a particular button, and out comes your ice-cold soft drink. But what happens when you put your money into the machine, push the button, and nothing happens? Now your behavior changes: You may press the button harder or start pushing down on the coin return lever. If these behaviors fail to either produce your soft drink or return your money, you may start pushing or hitting the machine, or perhaps even yell at it, as if it could understand human language and feel guilty for not giving you what you wanted! Eventually, you stop engaging in all these behaviors and walk away from the mechanical bandit. Thus, extinction usually produces an increase, then a decrease, in behavior.

Spontaneous Recovery

Spontaneous recovery occurs when a previously extinguished behavior reemerges, even without further association of a reinforcing or punishing consequence with the target behavior. For example, suppose a few days have passed since you lost your money in the vending machine. You decide to give it another try, on the spur of the moment, even though it “robbed” you last time. Such behavior following extinction is called spontaneous recovery. Interestingly, if the machine is now working properly, and you receive your soft drink after you pay for it and press the button, your “vending machine” behavior is actually strengthened.

Continuous And Intermittent Reinforcement

Broadly speaking, reinforcement of a response can occur continuously or intermittently. Behavior that is reinforced each and every time it occurs is said to be reinforced continuously. Behavior that is only occasionally reinforced is said to be reinforced intermittently. Not surprisingly, these two reinforcement processes have different effects on behavior. Skinner (1956) first demonstrated this finding when he was running low on food supplies for his rats and attempted to conserve food by reinforcing his rats’ responding only occasionally instead of after each response. He discovered that this procedure had two effects. First, of course, it increased the longevity of his food supply. Second, and more important, intermittent reinforcement produced more responding in his rats than continuous reinforcement did. He also later found that rats whose responding had been intermittently reinforced produced a greater number of responses during extinction than did rats whose behavior had been continuously reinforced. Thus, training under intermittent reinforcement made his rats’ behavior more resistant to extinction, or more durable, than training under continuous reinforcement.

Reinforcement Parameters

Several important variables influence the efficacy of potential reinforcers for maintaining learned behavior. For example, reinforcement history or a history of experience with specific reinforcers is a powerful determinant of behavior. The more often you encounter the same association of a specific behavior (e.g., putting on a coat) and its specific consequence (e.g., feeling warm), the stronger your reinforcement history is likely to be. A strong reinforcement history with any particular reinforcing stimulus increases the likelihood that a behavior will occur on future occasions. Other factors that influence how reinforcement influences behavior include reinforcer magnitude, response requirements to gain access to reinforcers, and the schedule of availability of potential reinforcers.

Reinforcer Magnitude

Reinforcer magnitude refers to size or amount (e.g., frequency or duration) of a particular reinforcing stimulus or event. In general, the greater the magnitude of a reinforcing stimulus, the more likely it will influence behavior upon its presentation. This concept makes sense when you stop and think about the wages people are willing to work for in their jobs. Most people would rather make more money than less money—so a job that pays $20 per hour is more attractive than a job that pays only minimum wage.

Response Requirements

Response requirement refers to the amount of behavior (e.g., the number of responses, intensity of responses) that must be emitted to obtain a reinforcer. In general, the lower the response requirement for producing a reinforcer, the more likely an organism is to engage in the behavior necessary to acquire it. For example, consider a study in which adult men pulled a lever to gain access to alcohol (Van Etten, Higgins, & Bickel, 1995). Participants could gain access to alcohol by pulling the lever a required number of times, ranging from a low response requirement of 100 pulls to a high requirement of 1,600 pulls. Alcohol served as a reinforcer only at lower response requirements (e.g., fewer than 800 lever pulls) and not at high response requirements. Thus, lower response requirements leave the efficacy of a reinforcer intact, whereas higher response requirements can render a potential reinforcer ineffective for changing behavior.

Schedules of Reinforcement

Your own experience has no doubt taught you that there is not always a one-to-one correlation between your behavior and its consequences. Sometimes your behavior is reinforced by its consequences and sometimes it is not. Dialing the telephone sometimes results in a busy signal, sometimes in no answer, and sometimes in the expected “hello.” Sometimes you get away with exceeding the posted speed limit on the freeway and sometimes you do not. Depending on the behavior, reinforcement and punishment often vary in frequency over time. To study the effects of such variation, psychologists have devised schedules of reinforcement that specify the conditions that must be satisfied before an organism’s responding is reinforced (Ferster & Skinner, 1957). These conditions are usually based on the passage of time, the number of responses emitted by the organism, or some combination of both.

Skinner’s discovery of the role of intermittent reinforcement in maintaining behavior stimulated an enormous amount of research on the effects of different kinds of reinforcement schedules on human and animal behavior. Indeed, many pages of the Journal of the Experimental Analysis of Behavior are filled with detailed information about the precise ways in which these schedules influence behavior. Four kinds of schedules, the so-called simple schedules, have received the most research attention. Skinner and others created these schedules by combining fixed or variable delivery of reinforcement with response-based or time-based delivery of reinforcement. Hence their names: fixed-ratio, variable-ratio, fixed-interval, and variable-interval. That these four schedules have different effects on behavior is evidenced by the fact that the same animal will produce different patterns of responding under each schedule. Let’s look at each of these schedules in a bit more detail.

Fixed-Ratio (FR) Schedules

Fixed-ratio or FR schedules require that reinforcement be delivered only after the organism emits a fixed number of responses. Thus, in an FR 10 schedule, every 10th response is reinforced. As you might expect, FR schedules produce high rates of responding because the rate of reinforcement is directly correlated with the amount of responding: The greater the number of responses the organism emits, the greater the number of reinforcements it receives. Organisms responding under FR schedules typically pause briefly after receiving reinforcement but then respond rapidly until they receive the next reinforcer.
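The FR rule is simple enough to sketch in a few lines of Python. This is only an illustration of the counting logic, not a laboratory control program; the class name `FixedRatio` and its `respond` method are hypothetical names chosen for this example.

```python
class FixedRatio:
    """Hypothetical FR schedule: every `ratio`-th response is reinforced."""

    def __init__(self, ratio):
        self.ratio = ratio
        self.count = 0  # responses emitted since the last reinforcer

    def respond(self):
        """Register one response; return True if it produces a reinforcer."""
        self.count += 1
        if self.count == self.ratio:
            self.count = 0  # the count restarts after each reinforcer
            return True
        return False


fr10 = FixedRatio(10)
outcomes = [fr10.respond() for _ in range(30)]
reinforced_at = [i + 1 for i, hit in enumerate(outcomes) if hit]
print(reinforced_at)  # [10, 20, 30]
```

Because the only way to advance the counter is to respond, faster responding brings the next reinforcer sooner, which is consistent with the high response rates described above.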

Variable-Ratio (VR) Schedules

The response requirement for reinforcement under variable-ratio or VR schedules varies from one reinforcer delivery to the next. For example, in a VR 100 schedule, the reinforcer may be delivered after the 50th response one time, after the 150th response the next time, and after the 100th response the following time. Thus, in a VR 100 schedule, the average number of responses to produce a reinforcer is 100. VR schedules produce higher rates of behavior than do FR schedules—and that behavior is highly resistant to extinction.
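A VR schedule can be sketched the same way, with the requirement redrawn after every reinforcer. The sketch below is a minimal illustration under stated assumptions: the class name `VariableRatio` is hypothetical, and drawing requirements uniformly between 1 and 2 × mean − 1 is just one convenient way to obtain the desired average, not a description of how any particular experiment generated its requirements.

```python
import random


class VariableRatio:
    """Hypothetical VR schedule: the number of responses required varies
    from one reinforcer to the next but averages `mean_ratio`."""

    def __init__(self, mean_ratio, rng=None):
        self.mean_ratio = mean_ratio
        self.rng = rng if rng is not None else random.Random()
        self.count = 0
        self.required = self._draw_requirement()

    def _draw_requirement(self):
        # Assumption: uniform on 1 .. 2*mean - 1, whose average is mean_ratio.
        return self.rng.randint(1, 2 * self.mean_ratio - 1)

    def respond(self):
        """Register one response; return True if it produces a reinforcer."""
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw_requirement()
            return True
        return False


# The three requirements mentioned in the text average to 100.
text_example = [50, 150, 100]
text_mean = sum(text_example) / len(text_example)

# Simulate many responses and estimate the responses emitted per reinforcer.
vr100 = VariableRatio(100, rng=random.Random(1))
total_responses = 200_000
reinforcers = sum(vr100.respond() for _ in range(total_responses))
responses_per_reinforcer = total_responses / reinforcers
print(text_mean, round(responses_per_reinforcer, 1))
```

Over a long run, the estimated responses per reinforcer should settle near 100, even though the organism cannot predict which individual response will pay off.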

Fixed-Interval (FI) Schedules

With fixed-interval or FI schedules, only the first response that occurs after a fixed amount of time has elapsed is reinforced. Thus, in an FI 20-second schedule, the first response after each 20-second interval has elapsed since the previous reinforcer delivery (or the start of the experimental session) is reinforced. All other responses have no effect on the delivery of reinforcement. For many animals, FI schedules produce a “scalloped” pattern of responding: immediately after reinforcement there is no responding—there is a post-reinforcement pause in responding. As time passes, though, responding increases gradually to the point of the next reinforcer delivery.
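The FI rule depends on elapsed time rather than on a response count, which a short Python sketch makes concrete. The class name `FixedInterval` and the response times below are hypothetical, chosen only to illustrate the rule; the sketch also assumes that a response exactly at the end of the interval counts as reinforced.

```python
class FixedInterval:
    """Hypothetical FI schedule: only the first response after `interval`
    seconds have elapsed since the last reinforcer is reinforced."""

    def __init__(self, interval):
        self.interval = interval
        self.last_reinforcer = 0.0  # timing starts at the session start

    def respond(self, t):
        """Register a response at time t (seconds since session start)."""
        if t - self.last_reinforcer >= self.interval:
            self.last_reinforcer = t  # the next interval starts now
            return True
        return False  # responses before the interval ends have no effect


fi20 = FixedInterval(20)
response_times = [5, 12, 19, 21, 30, 40, 41, 45, 62]
reinforced = [t for t in response_times if fi20.respond(t)]
print(reinforced)  # [21, 41, 62]
```

Six of the nine responses earn nothing, which suggests why responding soon after a reinforcer tends to drop out under this schedule.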

Variable-Interval (VI) Schedules

As with VR schedules, the requirement for reinforcement under variable-interval or VI schedules varies from one reinforcer delivery to the next. Unlike VR schedules, though, the criterion for reinforcement is time-based rather than response-based. For instance, in a VI one-minute schedule, a reinforcer becomes available, on average, one minute after the last reinforcer delivery. Some reinforcers become available after short intervals and some after longer intervals, producing a moderate, steady rate of responding. In all cases, though, the first response that the organism emits after the interval times out produces a reinforcer. Behavior learned under VI schedules, like behavior learned under VR schedules, is highly resistant to extinction.
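A VI schedule can be sketched by redrawing the interval after each reinforcer. The following is a minimal illustration, not a definitive implementation: the class name `VariableInterval` is hypothetical, the exponential distribution of intervals is an assumption made for convenience, and the simulated organism simply responds once per second.

```python
import random


class VariableInterval:
    """Hypothetical VI schedule: the first response after a variable
    interval (averaging `mean_interval` seconds) is reinforced."""

    def __init__(self, mean_interval, rng=None):
        self.mean_interval = mean_interval
        self.rng = rng if rng is not None else random.Random()
        self.last_reinforcer = 0.0
        self.current_interval = self._draw_interval()

    def _draw_interval(self):
        # Assumption: exponentially distributed intervals with the given mean.
        return self.rng.expovariate(1.0 / self.mean_interval)

    def respond(self, t):
        """Register a response at time t (seconds since session start)."""
        if t - self.last_reinforcer >= self.current_interval:
            self.last_reinforcer = t
            self.current_interval = self._draw_interval()
            return True
        return False


# Simulate steady responding, one response per second, on a VI 60-second schedule.
vi60 = VariableInterval(60, rng=random.Random(0))
session_seconds = 360_000
reinforcers = sum(vi60.respond(t) for t in range(1, session_seconds + 1))
mean_gap = session_seconds / reinforcers
print(round(mean_gap, 1))
```

With steady responding, the average time between reinforcers should come out close to the scheduled 60 seconds, slightly longer because each reinforcer waits for the next response.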

Although it is tempting to look for these sorts of reinforcement schedules in everyday life, they are more likely to be found within the confines of the controlled laboratory environment. Nature, it seems, does not parcel out its reinforcers in neat little packages that arrive according to one specific schedule (Crossman, 1983). For example, although most people receive a paycheck after fixed intervals of time, they generally do not pause before continuing to work—if they did, they would get fired! Too many other factors including peer pressure, natural enjoyment of work, and fear of being fired make it unlikely that a post-reinforcement pause will occur following getting paid. Nonetheless, schedules of reinforcement have been tremendously useful tools to psychologists in understanding how specific environmental events influence behavior (Mazur, 2002).

Other Reinforcement Schedules

In addition to the simple schedules, there are also many more complex schedules of reinforcement. For example, in concurrent schedules, reinforcement is available under two (or more) schedules operating independently of each other. The most common concurrent schedule is the concurrent VI VI schedule. For example, imagine a pigeon in an operant chamber with two colored disks that may be pecked to produce access to grain. The left disk operates on a VI 20-second schedule, while the right disk operates on a VI 60-second schedule. The pigeon may respond by pecking on either disk for grain because both VI schedules are available concurrently. Because concurrent schedules generally operate independently of one another, responding on one schedule does not influence the availability of reinforcement on the other schedule. Suppose the pigeon in our example responded on each disk—pecking the left disk would not change the availability of grain on the right disk.

Concurrent reinforcement schedules produce a fascinating behavioral phenomenon called matching. In our example of the pigeon responding under the concurrent VI 20-second VI 60-second schedule, the pigeon is highly likely to respond about three times as often on the VI 20 schedule as on the VI 60 schedule because the ratio of reinforcement is three to one across the two schedules (for every minute that passes under these two schedules, reinforcement on the VI 20 schedule is three times more likely than reinforcement on the VI 60 schedule). Matching has received intense empirical scrutiny since it was first discovered in the early 1960s by Richard Herrnstein (1961, 1970; for a detailed review, see Davison & McCarthy, 1988).
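The three-to-one arithmetic in the pigeon example can be checked with a short Python sketch. It assumes strict matching, that is, that the proportion of responses on an alternative equals the proportion of reinforcers obtained there; the function names are hypothetical, and real animals may deviate from this idealized prediction.

```python
def reinforcers_per_minute(vi_seconds):
    """Scheduled reinforcement rate for a VI schedule, in reinforcers per minute."""
    return 60.0 / vi_seconds


def matching_proportion(rate_a, rate_b):
    """Predicted share of responses on alternative A under strict matching."""
    return rate_a / (rate_a + rate_b)


left = reinforcers_per_minute(20)   # VI 20-second schedule on the left disk
right = reinforcers_per_minute(60)  # VI 60-second schedule on the right disk
p_left = matching_proportion(left, right)

print(left, right, p_left)  # 3.0 1.0 0.75
```

A predicted share of 0.75 on the left disk corresponds to three pecks on the left for every peck on the right, the same three-to-one ratio as the reinforcement rates.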

Up to this point, our discussion of concurrent schedules suggests that the reinforcement available on each schedule is always the same. It does not have to be: Each independent schedule may have a different type of reinforcement associated with it. For example, a pigeon may peck the right disk to gain access to grain and peck the left disk to gain access to water. The responses to each disk are the same (pecks on the right and pecks on the left), but in this case, the reinforcers are different.

Other types of complex reinforcement schedules include multiple schedules and mixed schedules. Multiple schedules of reinforcement are similar to concurrent schedules except that the schedules of reinforcement are in effect one at a time, and a specific stimulus indicates when each is available. For example, a pigeon behaving under a multiple schedule for access to grain might see a green light when an FI five-second schedule is in effect and a red light when a VR 20 schedule is in effect. The lights (i.e., red and green) indicate which schedule is in effect. Thus the lights serve as discriminative stimuli. Mixed schedules of reinforcement are functionally the same as multiple schedules, except that there is no discriminative stimulus indicating which schedule is in effect.

Conditioned Reinforcement

To this point, we have discussed reinforcement only in terms of primary reinforcers—biologically relevant stimuli such as food, water, and painful events. However, as you know from personal experience, behavior can also be maintained with a wider variety of stimuli—money, a smile, a hug, kind words, a pat on the back, and prizes and awards. These stimuli acquire their reinforcing properties through their association with primary reinforcers and are called conditioned reinforcers. Because it can be exchanged for so many different kinds of primary reinforcers in our society, money is by far the most common conditioned reinforcer among humans. That money is surely a conditioned reinforcer can be substantiated by asking yourself a simple question: would you continue to work if money could not be exchanged for food, water, shelter, medical services, and other necessities of life?

Conditioned reinforcement is the principle on which token economies are based. A token economy is a system used in some mental institutions and other applied settings, such as schools, to engender desired behavior. For example, in a mental institution, a resident might be given small plastic tokens for emitting desired behavior such as that involved with personal hygiene and social interaction. These tokens are later exchangeable for different goods and services. In the classroom, students who engage in appropriate behavior may earn tokens that are exchangeable for snack items and school supplies.

Stimulus Control

You may recall from our earlier discussion of response discrimination and response generalization that sometimes only a particular type of response will produce reinforcement in the presence of a discriminative stimulus, and sometimes variations of this response will produce reinforcement in the presence of the discriminative stimulus. In this section, we consider stimulus discrimination and stimulus generalization, or what is better known as stimulus control. This term implies that environmental stimuli exert strong control over behavior because of the consequences that are associated with them.

Stimulus Discrimination

Basically, stimulus discrimination is the process of learning about which stimuli indicate the potential for reinforcement, and which do not. Another way to think about stimulus discrimination is that behavior changes depending on the context. Suppose a pigeon receives grain only for pecking a green disk and not for pecking a red disk. The pigeon quickly learns to peck at the green disk and not to peck at the red disk. Thus, the pigeon learns to discriminate between the two different colored disks. Consider another example. Suppose that a smile on your mother’s face indicates that she is likely to give you money if you do the dishes. The money is available if, and only if, the smile is present and you do the dishes. If you do the dishes and the smile is not present, then she is not likely to offer you any money for doing the dishes. If you wash the dishes only when your mother is in a good mood (i.e., smiling), you have discriminated among her facial expressions to determine which ones indicate the likelihood of monetary reinforcement and which do not. Stimulus discrimination occurs when an organism behaves (i.e., does the dishes) in the presence of a particular stimulus (e.g., a smile) but not in the presence of other stimuli (e.g., a frown or a scowl). Stimulus discrimination results in specific behaviors in response to the presence of a specific discriminative stimulus.

Let’s consider one last example of stimulus discrimination. In order for your dog to learn that a treat is available for sitting up only when you utter the words “sit up,” you must give your dog the treat only when it sits up after you say those words, which constitute the discriminative stimulus. If the dog lies down, rolls over, barks, puts its nose in your lap, or does anything else, you would not give your dog a treat. After repeated trials in which the dog sits up following your command and you follow that behavior (and only that behavior) with a treat, the dog will learn to sit up only when you give this command.

Stimulus Generalization

Stimulus generalization refers to the process of determining which stimuli, in addition to a specific discriminative stimulus, may also function as a cue for the availability of reinforcement. The more closely stimuli resemble the discriminative stimulus on which an organism was originally trained, the more likely the organism is to respond to those stimuli. Another way to think about this process is that learning in one situation influences how you behave in similar situations. Consider again a pigeon in an operant chamber. We have trained it to peck a green disk to access grain. What would happen if we presented a greenish-blue disk to the bird? Perhaps it would peck at it because the disk resembles the green disk we originally trained it to peck. What if we present the pigeon with a blue disk? If the pigeon pecks the greenish-blue disk and the blue disk, then we can safely say that stimulus generalization has occurred.

Similarly, we often observe (sometimes amusing) confusion when we try to teach children to name different things such as animals. For example, we might be trying to teach a child to say the word “dog” each time she sees a dog. In a short while, the child will learn to say the word “dog” each and every time she sees a dog. But then, suppose we are out in the country riding in our car when the child spots a cow and spontaneously calls it a “dog.” In this case, the child has generalized the word “dog” to apply to another four-legged creature, a cow. (Now, our task as a teacher becomes one of teaching the child not to generalize, but to discriminate, so that she learns to say “dog” each time she sees a dog, and to say “cow” each time she sees a cow.)

Stimulus discrimination and generalization don’t just occur with the family pet and with learning to name animals. Such behavioral processes are ubiquitous in human society. For example,

  • we cross the street when the traffic light is red and vehicles have stopped on the street we wish to cross, but not when the traffic light is green and cars are oncoming (stimulus discrimination);
  • we sit quietly in our seats during a lecture, church services, and theatrical presentations (stimulus generalization);
  • we raise our hands before speaking in class, but not while partying with friends (stimulus discrimination);
  • we say “thank you” to people who have done something nice for us (stimulus generalization).

Let’s return to our pigeon example one more time. Suppose that we train the pigeon to peck at the green disk, and that pecks at this disk occasionally produce grain (let’s say we’re using a VI 60-second schedule). Suppose now that we begin to change the color of the disk, so that the basic shades of the color spectrum appear over several sessions in the operant chamber. What do you think the pigeon will do? How will it behave? Which disk colors will generate the most responding from our winged friend? Well, if we were to plot the frequency of pecking against the different shades of the disk, we would discover that the pigeon pecks most frequently when the disk is green and next most often at the colors that are similar to green, but that it pecks much less often, if at all, when the color of the disk is very different from green (say, a bright orange or red). This plot of the frequency of behavior generated by these different colored stimuli is called a generalization gradient. More specifically, most responding will likely occur at the point of the original discriminative stimulus (i.e., green). As the shades of color vary away from green to yellow or blue, the bird will likely continue to peck at a moderate rate because yellow and blue are most similar to green on the color spectrum. As the shades of color vary further and further from green (for example, to orange and red), pecking will likely occur at a very low frequency because these colors are very different from green.
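The general shape of a generalization gradient can be illustrated with a toy model in Python. Everything numerical here is an assumption made for illustration: the bell-shaped falloff, the `width_nm` parameter, and the approximate wavelengths assigned to each color are hypothetical and are not data from any experiment.

```python
import math


def relative_response_rate(wavelength_nm, trained_nm=530.0, width_nm=40.0):
    """Toy gradient: 1.0 at the trained stimulus, falling off with distance.
    The bell shape and `width_nm` are illustrative assumptions."""
    distance = wavelength_nm - trained_nm
    return math.exp(-(distance ** 2) / (2 * width_nm ** 2))


# Approximate, illustrative wavelengths (in nanometers) for each disk color.
disk_colors = {"blue": 470, "green": 530, "yellow": 580, "orange": 610, "red": 680}

gradient = {color: relative_response_rate(nm) for color, nm in disk_colors.items()}

# Crude text plot: longer bars mean more pecking relative to the trained color.
for color, rate in gradient.items():
    print(f"{color:>6}: {'#' * round(rate * 40)}")
```

In this toy model the bar for green is longest, the bars for yellow and blue are intermediate, and the bars for orange and red are short or empty, mirroring the pattern described above.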

Taken together, the phenomena of stimulus discrimination and stimulus generalization offer insight into some of the ways in which behavior comes under the control of environmental stimuli. Up to this point, we have discussed behavior control and behavior change largely as a function of positive reinforcement. In our final section, we outline ways in which negative reinforcement influences behavior.

Aversive Control Of Behavior

An aversive stimulus is one that an organism will behave to avoid, escape, or reduce (Fantino, 1979). You will recall that a negative reinforcer is any stimulus or event that, when removed, reduced, or prevented following a response, increases the frequency of that response over time. You also will recall that punishers decrease the behaviors upon which they are contingent. Typically, punishers are aversive stimuli. Aversive control of behavior is the use of aversive stimuli to control behavior: the idea is that organisms, particularly people, will act to avoid or reduce the likelihood that they receive or encounter aversive stimuli. To be sure, such control of behavior seems to pervade every nook and cranny of our society. From the spankings given to misbehaving toddlers and the fines given to speeding motorists, to the poor grades assigned to subpar students and the imprisonment of felons for their criminal deeds, our society uses aversive stimuli to attempt to control the behavior of its citizens.

Aversive control is prominent for two reasons. First, it can be highly effective in promoting behavior change, producing nearly immediate results. A person given a hefty fine for running a stop sign is likely, at least for a short while, to heed the sign’s instruction in the future.

Second, in many cases, society has little control over the positive reinforcers that shape and maintain individual behavior. However, society has a great deal of control over aversive stimuli that can be used to punish misconduct. For example, although society has no control per se over the kinds of stimuli that positively reinforce crime (money and power), it does control the stimuli for punishing such behavior (fines, community service, jail time). Although your driver’s education teacher probably complimented you when you obeyed traffic signs, when was the last time a police officer (or, for that matter, anyone else) rewarded you for such obedience? Your answer, of course, is likely to be “never” because in this case the police do not possess stimuli that can be used to reward your good driving habits—but be assured, they surely possess the stimuli that can be used to punish your actions. The theory of good driving behavior is that drivers will act to avoid getting pulled over, receiving a fine, or being sentenced to jail.

Although aversive control of behavior is effective in reducing and in some cases eliminating undesirable behavior, it can also produce several extremely negative side effects:

  • Unrestrained use of physical force by the person doing the punishing may cause serious bodily injury to the person being punished (e.g., child abuse).
  • Negative associations become attached to the people meting out punishers; punishment often induces fear, hostility, and other unwanted emotions in the people who are being punished. It may even result in retaliation against the person doing the punishing.
  • Through punishment, the person being punished learns only which response NOT to emit—punishment does not involve teaching the person being punished how to emit the correct or desirable behavior.

Thus, although aversive control of behavior is a popular means for governing individual behavior, it has several drawbacks. And, although the effects of aversive control can be reinforcing for persons doing the punishment, the persons on the receiving end often receive very little personal benefit. This effect is one reason why Skinner advocated that society use positive reinforcement rather than aversive control to govern its citizenry.


Much of our behavior, as well as the behavior of many other animals with whom we share this planet, is governed by operant conditioning. Skinner’s three-term contingency explains behavior acquisition, change, and maintenance through antecedents (discriminative stimuli), responses, and consequences. There are two basic types of consequences for behaving: reinforcement and punishment. Reinforcement occurs when some event results in the increased probability that the preceding behavior will occur on a future occasion and can be either positive (adding a stimulus) or negative (removing a stimulus). Punishment occurs when some event results in the decreased probability that a behavior will occur on a future occasion and can also be positive (adding a stimulus) or negative (removing a stimulus). Some stimuli, such as food and water, are naturally reinforcing and are called primary reinforcers; other stimuli, such as money, are called conditioned reinforcers and gain their reinforcing power by becoming closely associated with primary reinforcers.

The consequences of behavior play an integral role in the modification of responding. Shaping is a key procedure used to change behavior—it works through the application of reinforcement to successive approximations of a target behavior. Schedules of reinforcement also influence the acquisition and maintenance of behavior. Different kinds of reinforcement schedules produce different patterns of responding. The stimuli associated with reinforcement—discriminative stimuli—exert powerful control over behavior. Aversive control of behavior has been shown to be effective in controlling behavior, but it has several drawbacks that limit its overall ability to control behavior.


  1. Bolles, R. C. (1979). Learning theory. New York: Holt, Rinehart & Winston.
  2. Crossman, E. K. (1983). Las Vegas knows better. The Behavior Analyst, 6, 109-110.
  3. Davison, M., & McCarthy, D. (1988). The matching law: A research review. Hillsdale, NJ: Erlbaum.
  4. Fantino, E. (1979). Aversive control. In J. A. Nevin & G. S. Reynolds (Eds.), The study of behavior: Learning, motivation, emotion, and instinct (pp. 238-279). Glenview, IL: Scott, Foresman and Company.
  5. Ferster, C. B., & Skinner, B. F. (1957). Schedules of reinforcement. New York: Appleton-Century-Crofts.
  6. Herrnstein, R. J. (1961). Relative and absolute strength of responses as a function of frequency of reinforcement. Journal of the Experimental Analysis of Behavior, 4, 267-272.
  7. Herrnstein, R. J. (1970). On the law of effect. Journal of the Experimental Analysis of Behavior, 13, 243-266.
  8. Mazur, J. E. (2002). Learning and behavior (5th ed.). Upper Saddle River, NJ: Prentice Hall.
  9. Millenson, J. R., & Leslie, J. C. (1979). Principles of behavioral analysis (2nd ed.). New York: Macmillan.
  10. Skinner, B. F. (1938). The behavior of organisms: An experimental analysis. New York: Appleton-Century-Crofts.
  11. Skinner, B. F. (1948). Walden two. New York: Macmillan.
  12. Skinner, B. F. (1950). Are theories of learning necessary? Psychological Review, 57, 193-216.
  13. Skinner, B. F. (1956). A case history in the scientific method. American Psychologist, 11, 221-233.
  14. Skinner, B. F. (1957). Verbal behavior. New York: Appleton-Century-Crofts.
  15. Thorndike, E. L. (1911). Animal intelligence: Experimental studies. New York: Macmillan.
  16. Thorne, M., & Henley, T. B. (2001). Connections in the history and systems of psychology (2nd ed.). Boston: Houghton Mifflin.
  17. Van Etten, M. L., Higgins, S. T., & Bickel, W. K. (1995). Effects of response cost and unit dose on alcohol self-administration in moderate drinkers. Behavioural Pharmacology, 6, 754-758.
