Developing Behaviors and Eliminating Behaviors: Reinforcement Schedules & Extinction

By reinforcement schedule, we mean how often the trainer gives primary reinforcement (a treat) following a correct behavioral response to a given command. In the literature, you will find five kinds of reinforcement schedules:

(1) Fixed Interval
(2) Variable Interval
(3) Fixed Ratio
(4) Variable Ratio
(5) Random

Fixed Interval means that the primary reinforcement recurs after a fixed amount of time, for example every 30 seconds or every five minutes. If you work a salaried job, the typical example is your paycheck, which arrives every two weeks on a fixed interval reinforcement schedule.

Variable Interval means that the primary reinforcement recurs on a varying schedule: sometimes after 10 seconds, then 51 seconds later, then 3 minutes later, and then again 8 seconds later. If you are an entrepreneur, cash flows into your business on a variable interval, i.e., whenever you make a sale and collect the receivables.
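To make the two interval schedules concrete, here is a minimal Python sketch that lists the moments at which primary reinforcement would fall due during a session. The function names, the session length, and the choice to draw each variable gap uniformly between zero and twice the average are our own illustrative assumptions, not part of any formal definition.

```python
import random

def fixed_interval_times(interval_s, session_s):
    """Times (in seconds) at which a treat falls due on a fixed interval."""
    return list(range(interval_s, session_s + 1, interval_s))

def variable_interval_times(mean_interval_s, session_s, seed=None):
    """Times (in seconds) at which a treat falls due on a variable interval.

    Assumption: each gap is drawn uniformly from 0 to twice the mean,
    so the gaps average out to mean_interval_s.
    """
    rng = random.Random(seed)
    times = []
    t = 0.0
    while True:
        t += rng.uniform(0, 2 * mean_interval_s)
        if t > session_s:
            break
        times.append(t)
    return times

# A treat every 30 seconds during a two-minute session.
print(fixed_interval_times(30, 120))

# Treats at unpredictable moments, about 30 seconds apart on average.
print(variable_interval_times(30, 120, seed=1))
```

On the fixed schedule the moments are perfectly predictable; on the variable schedule only the average spacing is known.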

Fixed Ratio means that primary reinforcement is given on every nth correct performance of a behavior. So a 1:5 fixed ratio means that every 5th properly performed behavior is given primary reinforcement. When we say in this book that the dog is on 100% reward, we mean a 1:1 fixed ratio: every correct performance gets primary reinforcement. A fixed ratio will generally lead to poorer performance as the number of responses required per reward increases, because the animal learns that the early performances in each cycle are never rewarded (on a 1:3 fixed ratio, for example, the first two).
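The fixed ratio rule is simple enough to write down as a short Python sketch. The function name and the true/false representation of "treat given" are illustrative assumptions of ours.

```python
def fixed_ratio_rewards(correct_responses, n):
    """For each correct response, report whether it earns a treat.

    On a 1:n fixed ratio, exactly every n-th correct response is rewarded.
    """
    return [count % n == 0 for count in range(1, correct_responses + 1)]

# 1:5 fixed ratio over ten correct responses: only the 5th and 10th are rewarded.
print(fixed_ratio_rewards(10, 5))

# 1:1 fixed ratio (100% reward): every correct response is rewarded.
print(fixed_ratio_rewards(4, 1))
```

Because the pattern is entirely predictable, the dog can work out which performances never pay.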

Variable Ratio means that primary reinforcement is given after an average number of correct responses. Thus a variable ratio of 1:2 means that, on average, one out of every two correct responses is given primary reinforcement. This is what we refer to as the Variable Reward; technically, we mean a Variable Ratio of Reinforcement. In our training program, upon the transition to variable reward, we begin with a high frequency (a high average ratio, perhaps 1:3), and as the dog progresses we phase the primary reinforcement out by going to a lower frequency (a low average ratio of correct behaviors receiving primary reinforcement, perhaps 1:15) during any training sequence. Slot machines are an example of variable ratio reinforcement. If they never paid out, we wouldn’t try. But since we all have some experience of winning sometimes, we try hard (and spend a lot of money) to get the reinforcement! Vegas was built on variable reinforcement!
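A variable ratio can be sketched in Python as follows. The function name is hypothetical, and drawing the number of correct responses required for each treat uniformly from 1 to 2n - 1 (which averages out to n) is just one simple assumption about how to vary the count, not a prescribed method.

```python
import random

def variable_ratio_rewards(correct_responses, mean_n, seed=None):
    """For each correct response, report whether it earns a treat.

    Assumption: the number of correct responses needed for the next treat
    is drawn uniformly from 1 to 2 * mean_n - 1, so it averages mean_n.
    """
    rng = random.Random(seed)
    rewards = []
    remaining = rng.randint(1, 2 * mean_n - 1)
    for _ in range(correct_responses):
        remaining -= 1
        if remaining == 0:
            rewards.append(True)   # treat given
            remaining = rng.randint(1, 2 * mean_n - 1)
        else:
            rewards.append(False)  # correct, but no treat this time
    return rewards

# Early in the transition: roughly one treat per three correct responses.
early = variable_ratio_rewards(3000, 3, seed=1)
print(sum(early) / len(early))

# Later: roughly one treat per fifteen correct responses.
late = variable_ratio_rewards(3000, 15, seed=1)
print(sum(late) / len(late))
```

Unlike the fixed ratio, the dog cannot tell which correct response will pay, only that some of them do.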

Random reinforcement refers to there being no relationship between the behavior performed and the primary reinforcement given. Generally, nothing is learned from random reinforcement.

The opposite of developing a behavior is extinguishing it. If a behavior that was heavily reinforced in the past no longer receives any reinforcement (neither primary nor secondary), the behavior may extinguish; this is what we call behavioral extinction. A variable ratio reinforcement schedule tends to make the behavior less vulnerable to extinction, because from the dog’s point of view the absence of the reward probably only means that the dog must work a few more times to receive the reinforcement. Thus we are likely to see increased effort rather than no effort (this is called an extinction burst). A good example is a drink machine. The dollar bill acceptor is usually on a variable reinforcement schedule: after some fussing on your part the bill normally gets accepted, so you don’t expect it to work the first time. If you go to the machine and it does not accept the bill the first time, you try again and again, harder and harder (extinction burst). After a time, if it still doesn’t work, you go to another machine (extinction). Remember this example the next time you go to the drink machine!

However, if the behavior has been rewarded on a fixed ratio of 1:1 and the reinforcement stops (as at the transition point where we move from 100% reward to variable reward in our system), we are likely to get non-compliance due to immediate extinction. Thus, behaviors that are reinforced on a 1:1 fixed ratio are easier to extinguish than behaviors that are reinforced on a variable ratio. It therefore becomes imperative to transition to rewarding your dog on a variable ratio to make the behavior reliable!