The Power of Reward: Obedience.

By Jerry Bradshaw, Training Director, Tarheel Canine Training Inc.

Positive reinforcement is used in police K9 training to increase the likelihood of desirable behaviors. For example, we reward obedience with a ball, or a building search with a bite, or a narcotics find with a toy reward at the source of the odor. Anyone working with police dogs knows that reward increases the dog’s desire to perform these tasks.

But have you really thought about reward? How many of you reward your dog in obedience for particular behaviors, in order to sharpen the quality of key skills (fast sits, quick finishes, lightning recalls), or do you only reward him at the end of the routine? When you are doing narcotics training, how many of you plan out where hides will be placed to reward the dog’s search pattern and create independence in searching? When doing building searches, do you send your dog in to do one building search with an alert behind a door and a reward bite, or do you vary the placement of the decoy to reward checking doorways and searching high, and do multiple repetitions of the building search skill in a given training session?

In this series of three articles I wish to dissect the reward process in three different facets of police dog training: obedience, detection, and building search. This first installment will focus on obedience training.

When they come out of basic school with a young police dog, many handlers’ only exposure to obedience will have been a very traditional, negative-reinforcement-based process of teaching obedience exercises. The dogs are forced into position, usually with a choke collar causing discomfort; once the position is achieved, the discomfort is released and the handler is instructed to praise the dog. In many cases this method is used because it is assumed that reward-based training takes too long to develop the behaviors reliably, or because tradition dictates the method: “We have always done it this way, and we have a successful program, so there is no reason to change.”

Without getting into too deep a discussion about methodology here, let me just say that oftentimes the result of all this compulsion is a dog that is not very reliable in general. In my experience, many handlers struggle with obedience training. Handlers often get bitten during obedience as dogs react aggressively to the compulsion. Dogs are slow in executing commands, and often not very reliable. Handlers get frustrated. You hear a lot of this at police K9 seminars: “My dog hates obedience, he tries to get out of doing stuff when I am at a distance, and when I am close, he does it but he is slow in doing his commands. But in bite work or narcotics, he is high drive and performs great (usually except for outing).” Let’s see... he gets rewarded a lot in bite work and detection, and almost never in obedience, except after the entire routine is over...

I recall one occasion when a relatively well-known K9 officer came out to a PSA (www.psak9.org) trial I was judging to enter his dog in the level one. The level one is a mix of on and off leash obedience exercises including heeling (on and off leash patterns including change of pace, gunfire, turns, and figure 8), motion exercises, recalls to heel position while moving, and a recall and finish. The officer struggled with the dog’s performance, giving multiple extra commands, lots of hard body language and extra hand signals, and the dog blew him off numerous times for the distractions on the field. I could see he was mad at the end of the routine. On his way off the field, passing right by the critique area and headed to his car, he said to me: “Police obedience is different from sport obedience!” In return I said to him, “Obedience has nothing to do with the uniform the handler is wearing. It is giving a command to a dog and having him execute it the first time without extra handler influence or excuses for non-performance.” Now, I understand that in sport there are certain quality-of-performance differentials (a fast sit in motion is worth more points than a slow one, for example), and that in police work we are more concerned with functional than pretty. But his dog was not even minimally functional in a trial environment, which is not a whole lot different from a certification environment. I will say that since that performance, a number of police K9s have shown in PSA and performed admirably, so I am not attempting to generalize.

I do think, however, that many K9 handlers approach obedience as a must-do, and they use too much pressure. Many handlers do not take a lot of time to understand how to develop superior performance, and as a result their dogs do not perform as well as they could if the handlers understood some things about using reward in obedience. Even if your dog was initially taught his behaviors compulsively, you can bring reward into his obedience life and see a transformation in how powerfully obedient the dog can be. You can develop a more independent dog, and spend less time fighting him over compliance.

Reward Type

Suppose I told all of you that have narcotics dogs to stop giving your dog his toy when he found the source of target odor, and just praise him. You would likely think that I was an idiot not worth listening to. But if you don’t reward your dog in obedience for specific behaviors, you are essentially doing exactly the same thing for “obedience” behaviors. This is not logically consistent. If reward is good for detection behaviors, why is it not good for obedience behaviors? Praise is a distant second compared to a tangible reward. Remember that when you return to your agency and tell them you would prefer to get notes and calls from your superiors with praise for how you are doing your job, and that you will no longer require them to pay you money every two weeks. If you believe that, then keep on petting K9 Fido and keep telling yourself he is much warmer inside for it than he would be if you gave him something to tug on when he recalled to you.

In our police dog programs at Tarheel Canine, we use a reward item for obedience that is different from the dog’s drug toy. We use a jute roll for obedience rather than a ball for a number of reasons. The jute roll is not thrown; it is presented, the dog bites it, and you play with him as the reward. I want him to see me as the source of all things good, so when I reward him the jute comes from me (I hold it in the small of my back inside my belt, so he cannot see it), and when we are done playing, it returns to me.

For example, I would be heeling with him, and if he is giving me good attention or is in excellent position, I will reward him. The reward comes out with a signal (bridge, secondary reinforcer), which is a verbal noise like “psssst,” and in one motion I reach behind and bring out the jute and present it chest high for him to come up and grip. I play tug with him, and adjust his grip on it (helpful in bite work), and then out him at my discretion. This helps me keep a handle on his release, as I am in a position to correct him if need be for not letting go. He lets go and I can either re-command him to heel or sit and stay as I put the toy back in my belt, or I can reward his out with another grip on the tug before putting it away. This is simple, effective and very efficient. The play enhances the bond while setting limits on his behavior. I start and end the game. He can’t chase the toy and then decide to play the “keep away” game with me like he could if I threw it away from me. I stay firmly in control, but I have now signaled to him that attention and good position will get him something, so it is a good idea to repeat that behavior.

Deconstructing Obedience

If rewarding the dog is a good idea to increase the quality and responsiveness of his obedience, then I need to know when to reward him. The answer is not entirely simple, but suffice it to say that anything I want him to do well should be rewarded. Let’s take the heeling pattern for example. What mini-skills are in a simple heeling pattern? The start (step off into the pattern), position at heel, halts, left turns, right turns, about turns, fast pace, slow pace, normal pace, down and recall to heel, gunfire neutrality, and the figure 8 to name most of the skills. So what I am saying is that in order to develop high quality responses, you must reward the dog for each mini skill often enough that he knows that a good effort will likely result in a reward for the behavior.

You can see that if I hold such a philosophy, I would look with utter disbelief at someone who, after doing a 15 minute obedience routine, at the end, would throw a ball for the dog. Unless you are rewarding the dog for getting to the end of the 15 minutes you will never have back in your life, it does no good at all if this is all you do. However, if you are using a variable reward schedule and sometimes make him wait until the end to give you a string of good skills (routine) before you reward him, you are doing a great job of understanding how to motivate behavior.

The proper approach is to apply a variable reward schedule to the elements of your routine. First, look hard at your dog’s strengths and weaknesses in his heeling. For example, does your dog lag at the start of the heeling pattern, or does he crowd on his left turns, and lag on his right turns? If so, you have some things to work on, and you need to come up with a training plan to address the issues.

Example: Getting a better start to the heeling pattern. Come to the start line and ask the dog to get into heel position. As soon as he takes heel, and I mean as soon as his butt hits the ground, give him a quick, energetic reward bite on the jute. Give the verbal bridge and then present the tug in a fluid motion. Tug with him, play with him, let him enjoy it, and then out him, hide the tug, and quickly ask for heel again. He should come to heel faster in anticipation of the game. Tell him heel, and step off forward quickly. If he comes off the line fast as you step off, reward him again. Start over again and reward him three steps into the pattern, and then start over and do five steps in, and then end with a reward at the very beginning just as you did the first time. Then put him back in the car. You are going to do about five five-minute sessions instead of one twenty-five-minute session. As he improves, your sessions will be somewhat longer and you will do fewer repetitions.

Reward Placement

Notice that in the mini-session described in the above example, you would be rewarding the dog for getting into heel position. Many dogs are never rewarded at the start, so they take heel position slowly, perhaps sniffing a little, or going in slow motion, because what they anticipate is not a reward but a series of unpleasant jerks for the next 15 minutes. Don’t you put off tasks that you find unpleasant? So does your dog.

How you place your rewards is critical to maximizing performance. You will strategically place a reward for the first step of the heeling pattern, and then a few paces into the pattern, and then a few more paces in, and then back at the start. This variable placement increases your dog’s drive to get the reward, as your dog thinks it is equally likely to come at any point, and he will stay fast and focused if he believes there is a good likelihood of getting the reward. You can lengthen the time between rewards as his attitude and behavior improve to your standard, and then you can return to numerous rewards again.

Creating a standard of behavior is critical in good obedience training. If you set an expectation, and hold your dog to it with both correction when he performs below standard and reward when he meets or exceeds the standard, you will likely get a performance that meets or exceeds your standard each time you ask for obedience. But you as the handler must set that standard and stick to it, with both reward and enforcement of the standard where appropriate. Allowing behaviors which are below standard (leash pulling is sometimes allowed because the handler is being lazy enforcing the heel command when the dog is running toward the kennel to get fed, for example) will signal to the dog that when there is something he really wants to do instead of obedience, he can get away with it. So don’t get mad when he decides to blow you off on certification day. You created the permissive atmosphere.

Once you create the standard, ask for it, enforce it clearly, and reward the good behaviors variably. This looks like more work than, say, just jerking on the leash when he gets out of position; after all, the reward process does take time to get the toy out and play with him. I would rather take that time than have a lifetime nag-war with my dog, or see him go low in drive when we do obedience, or worse, see him get over on me when I need his obedience most.

Example: Rewarding Turns. Suppose your dog’s heeling is nice and clean on the straight legs, but he lags on his right turns all the time. First, you must figure out why he is lagging. Is it because you are speeding away from him when you turn to the right, taking him by surprise and leaving him in the dust? If so, you need to work on your footwork. Stepping off hard right and jerking the leash will induce an opposition response, and you are likely to create more lagging than you solve. Good footwork that allows the dog to see where you are going and a little focus from being rewarded while heeling will help keep your dog with you through a turn. But even better, reward him for the change in direction. If he stays with you upon executing a right turn, pop out his jute roll and reward him. Do it often. Set up a pattern with 5 right turns only. After the first turn, if he is with you up to your standard, reward him. If he lags on the first turn, pull out the jute and just tease him (“look what you could have had if you were up here!”) and put it away. Then do another turn: if he is with you, reward him; if he lags, tease again. If your dog is driven for a toy, you will see him want to get in a position to get the toy, and the lagging problem will go away. The key is to reward the skills individually and often. Then vary your rewards as the general proficiency of the skills increases. Don’t expect to solve a problem you have created over months or years in one session either.

Variable Reward

Suppose now that your dog is heeling the straight-away legs well and is now driving for the reward on the right turns. To keep the level of performance high, you must variably reward each of the skills in the heeling pattern. You do not need to reward every step or every turn, but rather, reward strategically.

Example: The heeling pattern. Now that you have developed each of the individual skills you identified by deconstructing the heeling pattern, you can string the behaviors together and variably reward the dog’s heeling pattern. You call your dog to heel position, and reward at the start with a nice tug session. You out the dog, and call into heel again, tug tucked inside your belt in the small of your back. You then heel forward. Evaluate your dog. Assume he is in good position and showing some attention and you go for a number of paces while his enthusiasm is still high, and then turn to the right, and BAM! Give a reward on the first right turn. Play and out the dog. Back to heel position, and BAM! Another reward is given, tug and quick out this time. Back to heel position and forward again, and while on the straight away, pull out your tug, and turn in a fast circle right, make him miss it, and put it back behind you and continue straight. Frustration will increase his drive, and you make a right turn which he drives into, and you continue straight for another 15 paces, and BAM! Another reward is given. Play and tug, and let him enjoy, and then out the dog. Get the dog back to heel. Go forward and now do your change of pace. You have already worked separately on rewarding each pace transition, from fast to slow pace, and then slow back to normal, or whatever transitions your certification asks for. Now within the context of a heeling pattern, reward one of the pace transitions. If you step off into a fast heel from a normal pace, and he stays with you, BAM! A reward is given. Play and restart. Go straight again, and go fast pace, and if you see he drives to keep with you, recalling the last reward, then shift to slow. You look for the quality of response. Does he slow himself without leash help? If so, then BAM! Reward that transition!

I think you get the picture. When you deconstruct a larger exercise, you work the components, and then put it together into a sequence of behavior and make sure you variably reward the skills you developed individually. It takes some pre-planning, but the outcome of an obedient, fast, flashy, and most of all compliant dog is well worth it.

Balancing with Compulsion

I am not suggesting a completely motivational approach here. What I am suggesting is a balanced approach. Set a standard of behavior. Expect it, and do what is necessary to get it every time, no matter what the context. It doesn’t matter whether you developed behaviors motivationally from the start or used a compulsive method to train the obedience. From today forward, let him know there is light at the end of the compliance tunnel. Reward signals to the dog there is something in it for him.

Do obedience especially when the dog doesn’t expect you to enforce it. Do it when you stop your car for a bathroom break: put the leash and collar on the dog, and when he is done, call him to heel position. If he comes back fast, reward that dog! If he ignores you, correct the disobedience, and when he comes into position reward the dog for complying, perhaps with the jute roll, to let him know that even if you force the behavior, compliance gets rewarded. Get your dogs thinking about complying because it is in their best interest to do so. Dogs will repeat behaviors that are successful.

If the dog is willfully disobedient, and a leash correction is warranted, correct him! But then let him know reward is available, if he complies. Give a dog the following choice: comply and be rewarded often (variably), or don’t comply and get corrected (not nagged). Dogs will work toward compliance and reward. However, if you are dealing with competing motivations, like trying to do obedience during bite work, you may experience more trouble. But again, if you deconstruct this problem, you will see that you can enforce obedience, and send the dog to bite as a reward, so that obedience around massive distraction can be turned into high level compliance.

Many police dogs as well as competition dogs completely lose any obedience when they are around a decoy in a bite suit. I am here to tell you that there are civilians, who train dogs as a hobby, who can control their dogs off leash with decoys running around cracking a whip or a clatter stick. Decoys can stand on either side of a series of jumps and the dog will jump each obstacle without taking a bite. During an obedience recall exercise, decoys can run alongside the dog and he will not bite but come to the handler. The dogs are put in stays, and decoys jump around the dog making noise. Sound difficult? It is, but it is doable.

This is because the dog has been taught to show control around the distraction until the command comes to take the grip, which is his reward for showing the restraint. The dog is made to behave to the standard, and the key is rewarding that compliance.

I have a personal dog that was returned to Tarheel Canine from a police K9 class because he lit up the handler in patrol class. I took the dog back, and decided to keep him for my personal competition dog. When I got him back, he was out of control around a decoy. He was pretty obedient away from that competing motivation. I developed a lot of his obedience away from that distraction, but I knew at some time we would have to cross the threshold of decoy distraction. So I took him to bite work training and started with a small but firm expectation: Look at me and you can go bite. At first he fought to go to the decoy, and I kept putting him back in a sit, telling him to look at me (a behavior I developed away from distraction). It took a bunch of tries. He just outright broke toward the decoy (I had him on leash, of course), and then he came up on me out of frustration, which I handled by holding him up for a few seconds and then putting him back in a sit (I didn’t take that personally, so I didn’t make a big deal over his acting out). After he exhausted a lot of non-compliant behaviors, he finally looked at me, and I immediately sent him for a bite. That cause and effect was not lost on him, as each time we did this it took less and less time to get the attention. I then variably rewarded that behavior by asking him to look at me a little longer, and then went back to immediate rewards, and then dragged out the attention longer, and went back to more immediate rewards. In two weeks I was heeling with attention around a decoy. Now he asks for permission to go bite, and I don’t have such a fight on my hands. When we go near a decoy he looks at me himself, without prompting. Yes, occasionally he gets a wild hair and I have to enforce my will, but I have a clear standard, and I never waver. He is more and more seeing things my way. Once I own his eyes, I own his behavior. But the real power is in the reward. He wants to bite, and the path to what he wants is obedience to me. It is a simple notion, but a powerful one.

Obedience Is Everywhere

As I proceed with this series of articles, you will see that obedience is everywhere. Not “jerk the leash” obedience but rather clear standards, followed by a reward and correction structure to optimize learning and get the results we desire. Every aspect of dog training is getting obedience to task, whether that task is searching for drugs, or hunting on a trail, or releasing a suspect once apprehended. Learning how to develop behaviors through a balance of reward and thoughtful compulsion will bring the best results. Reward is the most powerful tool we have; learn to use it to your benefit.